FORECASTING USING INTERPOLATION MODELING



BACKGROUND OF THE INVENTION 



Field of the Invention 

The present invention generally concerns techniques for predicting the value 
of a variable, such as the price of a share of stock or a commodity. More specifically, 
the present invention concerns prediction of the value of a variable based on 
predicted values for other variables. 

Description of the Related Art 
Forecasting Contests 
A number of forecasting contests have been conducted in the past. Such
contests range from various wagering events, such as Superbowl pools, to various
financial forecasting contests. Typically, such conventional contests seek to identify
the best predictor for the outcome of a single event. For example, the website at
www.investorsforecast.com allows participants to predict where the Dow Jones
Industrial Average (DJIA) will be and what the prices of certain stocks will be at the
end of next week. The person submitting the most accurate prediction for the DJIA
and the person submitting the most accurate prediction for an individual stock are
each given a fixed monetary award, such as $300. Other contests in the financial
arena typically allow participants to invest an imaginary amount of money, with the
winner being the person whose portfolio is the largest at the end of the contest. One
example of such a contest can be seen on the Web.

However, the present inventors have discovered that such conventional 
contests are inadequate in the following respects. First, the rankings generated by 
such contests typically do not provide useful information for truly identifying the best 
forecasters. This is a particularly significant shortcoming with respect to financial and 
economic forecasting, in which it is very useful for third parties to have that 
information. In addition, these conventional contests often reward short-term or 
single-event thinking, and such qualities may not be the most desirable in many 
cases. Finally, partly because of such short-term and single-event thinking, partly 
because of the specific events for which predictions are solicited in such 
conventional contests, and partly because of the manner in which such conventional 
contests are typically structured, the utility of the data produced by such conventional 
contests for purposes such as combination forecasting often is sub-optimal. 

In the financial and economic arenas, the result is that traditionally there has
been insufficient data upon which investors could rely in order to select investment
advisors. As a result, many investors are left to select advisors based largely on
arbitrary criteria or, in the best case, to rely on recommendations from friends. At the
same time, many actual and potential investment advisors who are very capable at
reading the market conventionally have had very little opportunity to demonstrate 
their expertise to the public, and thereby attract new clients. Similar concerns exist 
for other financial and economic experts who wish to demonstrate their expertise or 
the validity of their prediction techniques. 

What is needed, therefore, is a contest in which the rankings and/or rewards
are tied more closely to the forecasting characteristics that are most desirable and
that yields a large database of information which can serve as the basis for
comparing the predictions of different forecasters. It is also desirable that the contest
provide data that are statistically significant and can provide the basis for a wide
variety of combination forecasts and other statistical analyses as well as being highly
useful for marketing purposes.

Prediction Input

In conventional forecasting contests, participants typically submit their
predictions by writing, typing or speaking their predictions. Most frequently, such
predictions consist of a numerical estimate of what the value of the predicted variable
will be at a specified point in time. Thus, for instance, in the
www.investorsforecast.com website contest mentioned above, participants type in
the value of their estimates and then submit those estimates by clicking a button on
the website.

While such prediction submission techniques are adequate for their intended 
purpose, they suffer from many shortcomings. The following examples of such 
shortcomings have been identified by the present inventors. 

First, such conventional prediction submission techniques frequently are not 
very intuitive from the participant's point of view. In particular, they often require the 
participants to digest a significant amount of information in order to translate their 
rough feelings about the way the prediction variable is likely to move into a hard 
number. This is a significant disadvantage for those participants who are very
intuition-oriented. Moreover, to the extent such persons are prone to errors in
processing such data when converting their rough perceptions into a hard number, 
their submitted predictions may vary from what they actually believe about the 
subject variable. 

Second, having to enter numerical estimates for each prediction variable can
be cumbersome and time-consuming. This may have the effect of limiting the 
number of variables for which participants are willing to submit predictions. 

While other prediction submission techniques have been utilized, they
typically have had very limited applicability. For example, the website at
www.cyberskipper.com permits participants to compete in predicting certain
sports-related events. One of the prediction submission techniques utilized by this site is
to display a grid of possible events. The participants can then click on a cell within
the grid to designate their prediction that a particular event will occur. Thus, a
different grid is displayed for each baseball game, with each row of the grid
corresponding to a different baseball player and each column corresponding to a
different event (e.g., "runs", "hits", "home run"). If a participant believes that a certain
player will get a home run in a game, he simply clicks on the appropriate cell to enter
that prediction. As can be readily appreciated, this technique generally is limited to
predicting binary events (i.e., will/will-not occur). In many cases, this deficiency will
limit the applicability of such techniques to collection of very coarse predictions.

What is needed, therefore, is a more efficient and intuitive way to enter or 
submit prediction data that is applicable across a wide range of prediction events and 
that can permit participants to submit predictions with more specificity than has been 
available with conventional techniques. 



Provision of On-Line Resources 

Use of the Internet has become more and more common over the past few 
years. Similarly, the number of websites on the Internet has grown exponentially and
is expected to continue to grow at a fast pace. As a result, the amount of information
available on the Internet can be staggering. However, there is often little done to
ensure that the information provided to end users is the most relevant to those users.

A typical website might contain advertising, as well as a certain amount of 
content. Both types of information are typically controlled exclusively by the owner
of the website, possibly based loosely on some indications as to what visitors would
like to see, or based on what advertisers might believe will be most effective.
However, the present inventors question how good such strategies are at actually
providing website visitors with the information that they actually want and, in any
event, have concluded that the effectiveness of such conventional strategies must
necessarily vary based on the website owner's individual skill in gauging his
audience's desires.

Accordingly, the present inventors have discovered that what is needed is a
more systematic technique for providing appropriate resources to users over an 
electronic network, such as the Internet, that more accurately reflects the users' 
desires. 


Financial and Economic Forecasting 

The American economy is made up of the activities of hundreds
of millions of participants, simultaneously buying and selling goods and services in
the competitive economy. Probably the most famous market is the Stock Market for
the buying and selling of corporate ownership. Each business day, millions of shares
of stock are bought and sold at competitive prices. Prices set by the competitive
market change as people obtain different information regarding the availability and
demand for goods, services, and financial assets. No individual knows all the market
conditions in advance of trying to buy or sell. Knowing what prices will be in the
future could allow market participants to change the amounts at which they would
otherwise transact (e.g., if prices are expected to increase in the near future,
knowledgeable sellers might withhold inventory from the marketplace).

Almost as long as there have been measurements of economic data, people 
have attempted to formulate forecasts of prices and economic activity by using a
variety of techniques. During the past fifty years, several distinct methodologies for
producing economic forecasts have been explored. Some of the most important 
include large-scale econometric systems, time series methods, computationally 
intensive techniques, opinion polling, and combination methods. 

Economists, mathematicians, and forecasters have spent over a century
attempting to specify increasingly complex mathematical and statistical models,
which, some believe, could allow accurate forecasting to take place. Beginning with
economic and behavioral theory, mathematical equations representing the
interactions of different variables with each other are hypothesized. Then, using a
sophisticated set of econometric model identification techniques, specific numerical
values for the equations' parameters are calculated based on historical relationships
and observed data. Examples of these models have included the DRI Model, the
Wharton Model, and the UCLA Forecasting Project model. Such large multiple
equation mathematical forecasting models of the economy are increasingly
complex, modeling ever-finer levels of economic detail, but their very complexity
often makes them inaccurate as forecasting tools.

Some of these models can be used with fair accuracy to provide "what if"
simulations for the economy, simulations beginning from a specific initial set of
economic measurements and then computing the likely economic impact from
various policy changes (e.g., tax cuts, military spending). However, to the extent that
the starting values are not precisely measured, or that there are even ever-so-slight
errors in the mathematical equations, the resulting forecasts can display
extraordinary deviation from the values that eventually are observed in the economy.
These problems are made worse if, for any reason, historical economic data were
generated by a different set of relationships than are now found in the economy. In
this regard, one wag observed that these models are so accurate, economists have
successfully predicted 14 of the last 3 recessions. Even so, these large-scale
economic forecasting models remain the "gold standard" for economic forecasting,
and millions of dollars are spent each year to purchase forecasts from such systems.

Approximately thirty years ago, a group of econometricians, predominantly of
British origin, began to develop alternative economic prediction methods. Foremost,
single equation models using "time series" techniques popular in engineering
applications were found to out-predict the large multiple equation economic models.
The development of straightforward computer programs implementing these
forecasting techniques allowed for the rapid development of these single equation
forecasting models. Numerous economic variables were found to be reasonably
predictable using such techniques. These techniques have continued to advance
with the development of more complicated techniques (known by acronyms such as
"ARCH" and "GARCH"). However, these forecasting techniques are viewed with
some suspicion by many economists and forecasters because they lead to models
developed using empirical criteria, not models specified as the logical result of
economic theory. Even so, single equation forecasting methods are among the most
valuable tools used by technical and quantitative market analysts, and are widely
applied by Wall Street "Rocket Scientists" and many practicing business forecasters.

Another set of "Rocket Science" tools has become popular during the 1990s,
the "computationally intensive" forecasting tools. Using massive computerized
databases, mathematical search algorithms are employed to find "black boxes" for
forecasting. Such techniques include "neural networks", large systems of empirically
based equations with parameters that evolve over time. Neural networks appear to
be used, for example, in creating the forecasts produced by www.forecasts.org.
Ideally, neural networks learn from their mistakes and self-correct. Although neural
networks are the foundation of numerous automated trading and arbitrage systems
on Wall Street, in practice they sometimes "learn" too slowly and converge on very
localized forecasting rules, which do not generalize well.

Still being developed, but of great interest, are the computationally intensive
statistical pattern matching procedures. Just as the weather service locates
historical weather patterns in its database that look like current weather patterns,
and then bases long-term predictions on what the historical "next week's weather"
turned out to be, some forecasters are attempting to match past patterns of
economic and stock market data to current conditions to make long-term predictions.
These forecasters are sometimes referred to as the "Rocket Science Technical
Forecasters". However, these techniques are in their infancy and because of sparse
historical data may never be of more than limited use in most economic forecasting
applications.

In addition, public opinion polls and surveys have been used to forecast
"consumer sentiment" measures and to gather data on people's consumption
patterns. To some extent mirroring the data collection methods used by the
government to estimate its official economic measures, these have demonstrated
some ability to provide accurate forecasts of what upcoming government statistical
releases will say. For instance, the University of Michigan Center for Social 
Research is identified with its surveyed Index of Consumer Sentiment. Other major 
public opinion polls also routinely include questions regarding economic conditions.

A final category of forecasts, so-called "consensus forecasts", is similar to
opinion-poll surveys but with a key difference. In public opinion polls, random
populations are sampled. In creating a consensus forecast, polls and surveys of
economic and financial forecasters (and, sometimes, published forecasts) are
conducted. Typically, the median value across participants is the consensus
forecast. These surveys have proven to be quite good, generally outperforming over
time the individual forecasters who are included in the panel underlying the
consensus forecast. Consensus forecasts are regularly conducted for corporate
earnings, money supply and interest rates, and key macroeconomic variables. For
example, both IBES and First Call survey stock analysts to identify expected
corporate earnings. MMS surveys bank economists to estimate the money supply
figures on the upcoming Federal Reserve H-6 reports. Blue Chip Economic
Indicators was perhaps the first service providing median and average forecasts
from a group of forecasters for general economic variables (see
www.bluechippubs.com). The National Association of Business Economists
Forecast Survey provides at least quarterly reports on what its membership
anticipates for certain general economic variables. The Federal Reserve conducts
similar surveys of about 30 economic forecasters with results published regularly in
the financial press.

Consensus forecasts are an example of a broader, but relatively infrequently
applied, category of "combination forecasts". Combination forecasts are forecasts
created from a group of underlying forecasts. Approximately twenty-five years ago,
combining forecasts was an active area of econometric research and many
theoretical problems were solved, including sophisticated mathematical procedures
for determining optimally changing weights for the combinations. Although the
consensus forecast median is a combination forecast, median forecasts usually are
not the best combination forecasts, given the available data. However, they are
"pretty good" combination forecasts, and can be easily calculated.

The consensus forecasts require no historical information about either
predictions or accuracy. More sophisticated forecast combinations require a
historical track record for each forecast to be included in the combination. Once this 
track record is available, the forecasts can be analyzed into optimal combinations 
much like investments are combined into an optimal portfolio. 
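
The contrast between a median consensus and a track-record-based combination can be illustrated with a minimal Python sketch. It is not the optimal-weighting procedure referred to above: as a stand-in it uses inverse mean-squared-error weights, which assume that forecast errors are uncorrelated across forecasters. The function names and sample numbers are hypothetical.

```python
from statistics import median


def consensus_forecast(forecasts):
    """Median across forecasters; requires no track record."""
    return median(forecasts)


def inverse_mse_weights(past_errors):
    """Weights proportional to 1 / (mean squared past error) per forecaster.

    Assumption for illustration: errors are uncorrelated across forecasters,
    and every forecaster has a non-empty track record with nonzero error.
    """
    mse = [sum(e * e for e in errs) / len(errs) for errs in past_errors]
    inverse = [1.0 / m for m in mse]
    total = sum(inverse)
    return [w / total for w in inverse]


def combined_forecast(forecasts, weights):
    """Weighted average of the current forecasts."""
    return sum(f * w for f, w in zip(forecasts, weights))


# Hypothetical panel: three current forecasts and each forecaster's past errors.
forecasts = [100.0, 104.0, 110.0]
past_errors = [[1.0, -1.0], [2.0, -2.0], [4.0, -4.0]]

weights = inverse_mse_weights(past_errors)
combined = combined_forecast(forecasts, weights)
consensus = consensus_forecast(forecasts)
```

In this sketch the combination leans toward the forecaster with the smallest historical error, whereas the median treats all forecasters alike.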

While consensus forecasting is alive and well, it appears that the broader
optimal forecast combination literature has been abandoned or forgotten except,
perhaps, in a few academic strongholds. This is not surprising. At the time these
theoretical combination techniques were being developed, the efficient market
hypothesis was in its prime and stock market forecasts were viewed with great
suspicion, if they were considered at all, by academics. Economic forecasts were
generally produced on a monthly basis at best, and more often on a quarterly basis.
Because virtually all computation was still done on cumbersome mainframe systems,
often as overnight batch computation jobs, forecasts were expensive to obtain. Even
if a large number of forecasts were available, the optimal combinations could have
required more computing power than was readily available to users, just as the
Markowitz portfolio problems were generally intractable in practice.

Consequently, the lesson that seemed to be learned from the forecasting 
combination literature is that people get more accurate predictions if they somehow 
take an average of forecasts. Hence, demand grew for consensus forecasts based
on simple surveys of forecasters, but more advanced combinations were not widely
used due to cost, data constraints, and computational complexity. Like many 
technologies, the optimal forecast combination techniques were developed before 
the infrastructure was available to allow for their effective implementation. 

In addition, combination forecasting can be difficult to implement for a large
forecasting panel over a significant period of time, largely because the makeup of the
forecasting panel varies over time and because the frequency of participation by the
various members of the forecasting panel cannot be adequately controlled.

Still further, in certain cases there may be insufficient forecaster participation
to permit a combination forecast of sufficient accuracy. Also, even if an accurate
combination forecast is generated for a variable, it may be difficult to say with any
certainty what was the relative importance of various factors in arriving at the forecast.

Thus, what is needed is a more accurate forecasting methodology that
overcomes the above shortcomings in the prior art.

Utilization of Banner Ad Click-Through Information 

Many conventional websites include banner advertisements which also 
function as hyperlinks to the advertiser's website. Thus, if a website visitor is
sufficiently interested by the advertisement, he can simply click on the advertisement
to retrieve the advertiser's webpage and obtain more information about the particular
product or service. Use of such banner advertisements can provide advertising
revenue for the displaying website and additional exposure for the advertising
company.

In order to better target their advertising efforts, such advertisers might keep 
track of how many visitors to their site resulted from click-throughs for each of the 
various banner ads they have posted on others' websites. However, the present 
inventors have discovered that banner ad click-through information can be used in 
a wide variety of additional applications, such as further increasing the efficiency of
advertisers' marketing efforts, predicting certain events, and others. 

SUMMARY OF THE INVENTION 
The present invention addresses the foregoing problems by providing a 
number of different inventive features which can be implemented individually or in
any of a wide variety of combinations. These inventive features generally can be 
grouped according to the following categories. 

Forecasting Contest

The present invention provides forecasting contests that include features
directed to better ranking of the participants and/or that result in a better database
of prediction data.

Thus, in one aspect, the invention is directed to conducting a contest that
produces forecasting data for predesignated variables whose values change over
time. Initially, participant registrations are accepted, and the participants are
permitted to submit predictions of values, projected at plural different time points, for
at least one of several predesignated variables. For example, an individual
participant might elect to predict what the exchange rate between the U.S. Dollar and
the Japanese Yen will be at the end of next week and at the end of the year. Then,
the participants receive an overall ranking based on their relative accuracies (e.g.,
percentile rankings) in individual prediction events.

By ranking individuals based on their relative accuracies in individual
prediction events, a contest conducted according to this aspect of the invention
permits an overall ranking within a group of participants even though the participants
in the group might be predicting different combinations of variables or might be
predicting for different time horizons. At the same time, ranking based on
performance in a number of different prediction events often can provide more
meaningful rankings, for example, by eliminating many of the incentives to engage
in strategies that may occasionally provide high rankings in individual prediction
events. For instance, in conventional contests in which ranking is based on accuracy in
individual prediction events and recognition is given only to the top performers, a
participant might have a strategic incentive to predict relatively unlikely values rather
than values that he actually expects to occur so that occasionally he will be correct
and will be listed as a top forecaster, rather than always ranking near the middle.

In another aspect, the invention is directed to conducting a contest that
produces forecasting data for predesignated variables whose values change over
time. Participant registrations are accepted, but in this aspect of the invention
registration by a participant requires providing information regarding demographic
characteristics of the participant. Participants are then permitted to submit
predictions of values, projected at plural different time points, for at least one of
certain predesignated variables. Finally, the participants are ranked based on their
track records over a predefined period of time. In this aspect of the invention, the
predesignated variables include economic and/or financial variables, and participants
are rewarded for updating their predictions as early as possible.

By requiring demographic information as a condition to registration, this
aspect of the invention can often create a more useful database of prediction data
for purposes such as combination forecasting. Also, rewarding participants for
updating their predictions as early as possible can provide a fuller, more complete
and more continuous database. Finally, as noted above, by ranking based on track
record over a pre-determined period of time, single-event strategies often can be
largely eliminated.

In another aspect, the invention is directed to conducting a contest that
produces forecasting data for predesignated variables whose values change over
time. Participant registrations are accepted, with participant registration including
providing information regarding personal characteristics of the participant. The
participants are permitted to submit predictions of values, projected at plural different
time points, for at least one of certain predesignated variables, including economic
and/or financial variables. Then, the participants are ranked based on their track
records over a predefined period of time. This ranking includes: (1) determining, for
each participant and for each of plural prediction events in which the participant
competed, a percentile rank in comparison to other participants who competed in the
prediction event; (2) combining the percentile ranks for each participant to produce
a raw score for the participant; and (3) ranking the participants based on the raw
score for each participant.

The ranking technique utilized in this aspect of the invention can be
systematic and automatically implemented, while maintaining the above-described
advantages of providing an overall ranking based on relative accuracies in individual
prediction events.
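
The three ranking steps described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the text does not fix how a percentile rank is computed or how percentile ranks are combined, so the sketch assumes that a percentile rank is the percentage of other participants in the event with a larger absolute error (ties counted as half) and that the raw score is the plain average of a participant's percentile ranks. All names and data are hypothetical.

```python
from collections import defaultdict


def percentile_ranks(errors_by_participant):
    """Step (1): percentile rank of each participant within one prediction event.

    errors_by_participant maps participant name -> absolute prediction error.
    Assumed convention: rank = percentage of other participants with a larger
    error, counting ties as half.
    """
    ranks = {}
    for name, err in errors_by_participant.items():
        others = [e for n, e in errors_by_participant.items() if n != name]
        if not others:
            continue  # nobody to compare against in this event
        worse = sum(1 for e in others if e > err)
        tied = sum(1 for e in others if e == err)
        ranks[name] = 100.0 * (worse + 0.5 * tied) / len(others)
    return ranks


def overall_ranking(events):
    """Steps (2) and (3): average percentile ranks into a raw score, then sort."""
    collected = defaultdict(list)
    for errors in events:
        for name, pct in percentile_ranks(errors).items():
            collected[name].append(pct)
    raw = {name: sum(p) / len(p) for name, p in collected.items()}
    return sorted(raw.items(), key=lambda item: item[1], reverse=True)


# Hypothetical events; participants need not enter the same events.
events = [
    {"ann": 0.5, "bob": 2.0, "cy": 1.0},
    {"ann": 1.0, "bob": 3.0},   # cy did not enter this event
    {"bob": 0.2, "cy": 0.2},    # tie between bob and cy
]
ranking = overall_ranking(events)
```

Because each score is relative to the other entrants in the same event, participants who predicted different variables or time horizons still end up on one common scale.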

In a still further aspect, the invention is directed to conducting a contest that
produces forecasting data for predesignated variables whose values change over
time. Participant registrations are accepted, and the participants are permitted to
submit predictions of values, projected at plural different time points, for at least one
of certain predesignated variables. The participants then receive an overall ranking
based on their track record over a pre-defined period of time and based on
consistency of their accuracies in individual prediction events.

By basing overall ranking on accuracy consistency in individual prediction
events, as well as on track record, this aspect of the invention can often provide
better ranking information than conventional ranking techniques permit. For
example, in the investment arena an important quality in judging the merit of an
investment advisor will often be consistency, as inconsistency typically translates
directly into higher risk. Thus, by ranking based on a combination of accuracy and
consistency, this aspect of the present invention can often provide a ranking that is
typically more meaningful to third parties, such as investors.
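
One way to picture a ranking that rewards both accuracy and consistency is sketched below in Python. The text does not specify a formula, so the one used here (mean percentile rank minus a penalty times its standard deviation) and the default penalty of 0.5 are purely illustrative assumptions.

```python
from statistics import mean, pstdev


def consistency_adjusted_score(percentile_ranks, penalty=0.5):
    """Mean percentile rank reduced by a multiple of its dispersion.

    Both the formula and the default penalty are assumptions made for this
    sketch; they are not taken from the text.
    """
    return mean(percentile_ranks) - penalty * pstdev(percentile_ranks)


# Two hypothetical participants with the same average accuracy.
steady = [60.0, 60.0, 60.0, 60.0]
erratic = [100.0, 20.0, 100.0, 20.0]

steady_score = consistency_adjusted_score(steady)
erratic_score = consistency_adjusted_score(erratic)
```

Under these assumptions the steady participant outranks the erratic one even though their average percentile ranks are identical, which mirrors the point that inconsistency translates into higher risk.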

In a still further aspect, the invention is directed to conducting a contest that
produces forecasting data for predesignated variables whose values change over
time. Participant registrations are accepted, and the participants are permitted to
submit predictions in plural different prediction events, each prediction event having
a closing time point by which final predictions must be submitted. Then, an overall
ranking of the participants is determined based on the participants' track records in
the prediction events over a pre-defined period of time and based on how soon their
final predictions were made before the closing time points.

By basing the overall ranking on how soon the participants' final predictions
were made before certain closing time points, as described above, this aspect of the
invention often encourages earlier predictions and more frequent prediction updates,
thereby providing a more complete database of prediction data. At the same time,
participants are rewarded for discovering and/or incorporating new information into
their predictions at the earliest possible time, with the result that both the quality of
the prediction data and the quality of the rankings are likely enhanced.

In a still further aspect, the invention is directed to conducting a contest that 
produces forecasting data for predesignated variables whose values change over 
time. Participant registrations are accepted, and the participants are permitted to
submit predictions of values, projected at plural different time points, for at least one
of certain predesignated variables. The participants also are permitted to submit
estimates of their own uncertainty regarding their predictions. 

By permitting participants to submit estimates of their own prediction
uncertainty in the foregoing manner, participants often are encouraged to participate
more frequently, even if they are somewhat less certain regarding their predictions.
As a result, more data are collected. At the same time, the additional uncertainty
data enhances the prediction database, thus frequently permitting more
accurate combination forecasts, more accurate determination of other statistical
indicators, and even creation of additional statistical measures, all toward the end of
more accurately gauging the sentiments of the forecasting panel.



Prediction Input 

The invention also addresses the above-mentioned problems in the prior art 
by permitting users to enter predictions graphically.

Thus, in one aspect, the invention is directed to facilitating the entry of
prediction data. Initially, a graph is electronically displayed, the graph including a
historical portion that includes historical values of the variable over time and also
including a future portion. Then, a participant is permitted to designate a point on the
future portion of the graph (e.g., by using an input device such as a mouse, a
touch-sensitive display screen or the like) and the designated point is converted into a
predicted value for the variable at a realization time.

In another aspect, the invention is directed to a method for entering prediction
data for a variable. Initially, a participant causes a graph to be electronically
displayed, the graph including a historical portion that includes historical values of
the variable over time and also including a future portion. Next, the participant
designates a point on the future portion of the graph, the position of the point
corresponding to the predicted value for the variable at a particular realization time
and also corresponding to the realization time itself. For instance, the horizontal
position of the point might correspond to the realization time while the vertical
position of the point corresponds to the predicted value. Finally, the participant
enters the predicted value, such as by clicking on an "enter" button.
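
The conversion of a designated point into a realization time and a predicted value can be sketched in Python as a pair of linear mappings. The class, its field names, and the pixel layout are hypothetical; the sketch assumes linear axes, screen coordinates whose vertical value grows downward, and a single time marking the boundary between the historical and future portions of the graph.

```python
from dataclasses import dataclass


@dataclass
class GraphAxes:
    # Pixel bounds of the plot area (hypothetical layout).
    left_px: float
    right_px: float
    top_px: float
    bottom_px: float
    # Data ranges shown on the horizontal (time) and vertical (value) axes.
    t_min: float
    t_max: float
    v_min: float
    v_max: float
    # Time separating the historical portion from the future portion.
    now: float

    def point_to_prediction(self, x_px, y_px):
        """Map a designated point to (realization_time, predicted_value)."""
        inside = (self.left_px <= x_px <= self.right_px
                  and self.top_px <= y_px <= self.bottom_px)
        if not inside:
            raise ValueError("point lies outside the plot area")
        x_frac = (x_px - self.left_px) / (self.right_px - self.left_px)
        t = self.t_min + x_frac * (self.t_max - self.t_min)
        if t <= self.now:
            raise ValueError("point lies in the historical portion of the graph")
        # Screen y grows downward, so the top of the plot is v_max.
        y_frac = (y_px - self.top_px) / (self.bottom_px - self.top_px)
        v = self.v_max - y_frac * (self.v_max - self.v_min)
        return t, v


# Hypothetical 400 x 200 pixel plot covering times 0..40, with "now" at 30.
axes = GraphAxes(left_px=0, right_px=400, top_px=0, bottom_px=200,
                 t_min=0, t_max=40, v_min=100, v_max=120, now=30)
realization_time, predicted_value = axes.point_to_prediction(350, 50)
```

Rejecting points at or before the boundary time keeps entries confined to the future portion of the graph, consistent with the description above.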

By allowing a participant to see a graphical depiction of historical values for
a prediction variable and then to enter a prediction value for the variable in the
foregoing manner, the present invention can offer a more intuitive way to enter
prediction values than has been available in the prior art techniques. In addition, the
foregoing technique can permit a participant to observe and evaluate a significant
amount of information at the same time that he is entering his prediction.

Additional features of the invention include: also displaying on the same graph
historical values for other variables; providing the ability to display the historical data
and/or the predicted value for the prediction variable with respect to a different
independent variable than in the initial graph; displaying multiple variables on an
initial graph in a first view (e.g., a time series view) and then permitting the participant
to obtain a view that is a rotation of the first view (e.g., a cross-maturity comparison
view); permitting the participant to numerically alter the prediction after it has been
entered graphically; permitting the participant to alternatively bypass the graphical
input altogether and instead enter the prediction numerically; permitting the
participant to enter, in addition to his prediction, an estimate of his own uncertainty
regarding his prediction; permitting the participant to graph only certain ranges
specified by the participant; permitting the participant to change scales of the graph;
permitting the participant to obtain graphs of arbitrarily requested mathematical
transformations of historical and/or prediction data; permitting the participant to alter
his predictions based on any of the foregoing different views, and even from within
any or all of the different views; linking historical and/or current data, news,
publications, etc. to the cursor position as it moves across the graph, so that such
information is easily and conveniently available to the participant; and, lastly,
matching the participant's prediction(s) to different prediction models to find the
closest model, and thereafter providing the participant with information regarding the
model, such as the type of model, the implied assumptions in the participant's
prediction(s), and the amount of weight the participant is implicitly applying to
different items or pieces of information that underlie the identified forecasting model.

Any or all of the foregoing features can be included in the prediction input
techniques of the present invention. All enhance the basic prediction input technique
described above by providing the participant with a wide variety of different types of
data in any of a wide variety of different formats, thus permitting each individual
participant to obtain the data that are most useful to him and to display such data in
the format(s) that are most useful to him.

Community-Selected Content

The present invention also addresses the above-described problems of
providing the most useful content over an electronic network, such as the Internet.
Generally speaking, this problem is addressed in the present invention by providing
a systematic technique for allowing users to participate in determining what content
is most useful to them.

Thus, according to one aspect, the invention maintains a collection of
resources that can be accessed by a participant over the electronic network (such
as the Internet) at a given time and, typically upon request, provides such resources
to the participant over the electronic network. Points are assigned to each resource
based on participant access of the resource and the collection is modified based on
the points assigned to each resource. For instance, a fixed number of points may
be assigned to each resource when a participant accesses the resource and the
resources having the worst overall rating based on assigned points may be removed
from the collection. Alternatively, a resource may be moved from the initial collection
and placed in a second collection when its number of points has reached a certain
predetermined criterion (e.g., a fixed number or a fixed number within a set period
of time).

By assigning points and modifying the collection in the foregoing manner, the
present invention can provide a systematic and automatic technique for updating a
collection of resources over an electronic network, such as the Internet. In a more
particularized aspect of the invention, the number of points assigned to a resource
when a participant accesses the resource is based upon the participation level of the
participant. In this way, the participants who are most active on the network can
have the greatest impact on the resource collection.

In another particularized aspect of the invention, each resource is assigned
a score based on the points assigned to the resource, with points assigned more
recently being weighted more heavily in determining the score than points assigned
less recently. In this way, it can be possible to properly maintain the collection even
in the presence of changing tastes or changing consumer needs.

In a further aspect, the invention is directed to providing information to
participants over an electronic network by maintaining a collection of resources.
Participants are permitted to rate the resources and points are assigned to each
resource based on participant rating of the resource. The collection of resources is
then modified based on assigned points for each resource.

In the foregoing manner, participants have the ability to directly assess the
usefulness of any particular resource to them and these assessments are utilized to
modify the collection. This can have the effect of making the resource collection
even more responsive to the needs of the participants (or users) because, although
a resource might initially appear to be valuable, upon closer inspection a user might
find it to be inaccurate, poorly organized or lacking for any other reason. Thus,
allowing participant ratings and the utilization of those ratings in the foregoing
manner often will account for such problems.

In a still further aspect, the invention is directed to providing information to
participants over an electronic network by maintaining a collection of resources.
Participants are permitted to both access and rate the resources, with points
assigned to each resource based on such ratings and access. The collection of
resources is then modified based on total points for each resource.

By combining point assignments based on both ratings and access, this
aspect of the invention often can provide all of the benefits described above.
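
The point-assignment and collection-maintenance steps above might be sketched in Python as follows. The multiplicative scaling by participation level, the exponential half-life decay used to weight recent points more heavily, and all function names are assumptions chosen for illustration; the specification does not prescribe particular formulas.

```python
def access_points(base_points, participation_level):
    """Points for one access, scaled by how active the accessing participant is."""
    return base_points * participation_level


def resource_score(events, now, half_life):
    """Score a resource from (time, points) events, halving the weight of
    points for every `half_life` time units of age."""
    return sum(points * 0.5 ** ((now - t) / half_life) for t, points in events)


def prune(collection, now, half_life, keep):
    """Rank resources by score; return (kept, removed) resource names."""
    ranked = sorted(
        collection,
        key=lambda name: resource_score(collection[name], now, half_life),
        reverse=True,
    )
    return ranked[:keep], ranked[keep:]
```

Here `collection` maps each resource name to its list of point events, which may come from accesses, ratings or both. A threshold test on `resource_score` could equally be used to move a resource into a second collection instead of removing it.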

Combination Forecasting Using Clusterization 

The present invention addresses the problems with attempting to use
combination forecasting in certain cases (such as where membership of the
forecasting panel is inconsistent) by using clusterization techniques.

Thus, in one aspect the invention is directed to providing combination
forecasts using predictions obtained from a group of forecasters. The forecasters
are first divided into a number of pre-defined clusters, which typically will have been
formed using statistical clustering techniques. In particular, clusters of forecasters
can be formed based on similarities of the forecasters' predictions. Then, statistical
data are calculated for each pre-defined cluster (e.g., measures of central tendency
and dispersion). Finally, the statistical data for all the pre-defined clusters are
combined so as to obtain a combination forecast.

By utilizing clustering in the foregoing manner, the present invention often can
avoid the difficulties of inconsistent forecaster participation. For instance, by utilizing
cluster statistics, it often will be much less significant whether or not any particular
individual submits a forecast for a given prediction event.

The foregoing steps can be repeated for each new prediction event. For
example, after an initial clustering with respect to a given prediction variable, each
time it is desired to generate a new combination forecast for that prediction variable,
the currently participating forecasters can be simply assigned to their previously
identified clusters and then new cluster statistics can be determined and combined.

When generating the combination forecast, it is generally preferable to weight
the central tendency for each cluster based on its dispersion measure (e.g., more
tightly clustered predictions given more weight than less tightly clustered predictions)
and/or based on the cluster's previous prediction accuracy (e.g., clusters having
historically better prediction accuracies are given more weight).
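
One possible way to carry out the clustering-based combination described above is sketched below in Python. The choice of the mean and population standard deviation as the cluster statistics, the inverse-variance weighting, the optional accuracy multiplier and the `min_dispersion` floor are all illustrative assumptions; the specification leaves the exact statistics and weights open.

```python
from statistics import mean, pstdev


def group_by_cluster(predictions, membership):
    """Assign each participating forecaster's prediction to its known cluster."""
    groups = {}
    for forecaster, value in predictions.items():
        groups.setdefault(membership[forecaster], []).append(value)
    return groups


def cluster_stats(groups):
    """Return {cluster: (central tendency, dispersion)} for non-empty clusters."""
    return {c: (mean(v), pstdev(v)) for c, v in groups.items() if v}


def combination_forecast(stats, accuracy=None, min_dispersion=1e-6):
    """Weighted average of cluster means. Each weight is an optional accuracy
    factor divided by the squared dispersion, so tighter and historically
    better clusters count for more. `min_dispersion` avoids division by zero
    and should be chosen to suit the units of the variable."""
    if not stats:
        raise ValueError("no cluster statistics to combine")
    accuracy = accuracy or {}
    numerator = denominator = 0.0
    for cluster, (center, spread) in stats.items():
        weight = accuracy.get(cluster, 1.0) / max(spread, min_dispersion) ** 2
        numerator += weight * center
        denominator += weight
    return numerator / denominator
```

Because only cluster-level statistics enter the final average, the absence of any one forecaster from a given prediction event changes the result only through that forecaster's cluster.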

It is also preferable to periodically re-cluster the forecasters to obtain a new
set of pre-defined clusters. This often will be desirable to take account of shifting
demographics, attitudes, social climates, economic conditions, and similar matters.

More particularized aspects of the invention also include identifying an
assignment formula for assigning each new forecaster to a pre-defined cluster based
on personal characteristics of the new forecaster. This feature of the invention can
permit additions of new forecasters in between re-clusterizations.


Forecasting Using Interpolation Modeling 

The present invention also addresses the problems of predicting variables for 
which there is insufficient forecaster participation and parsing changes in the value 
of a variable to determine the relative impact of various factors on the change. 

Thus, in one aspect, the invention is directed to predicting a value of a target
variable based on predictions of other variables. This aspect of the invention
involves obtaining historical values for the target variable at each of several time
points and obtaining previously predicted values and currently predicted values for
each of several predictor variables, the predictor variables being different from the
target variable. Values are assigned to parameters of a forecasting model to obtain
the best fit of the previously predicted values for the predictor variables to the
historical values for the target variable. Finally, a value of the target variable is
predicted from the currently predicted values for at least a subset of the predictor
variables using the forecasting model and the values assigned to the parameters of
the forecasting model.

By using predictions of other variables in the foregoing manner, the present
invention is often able to predict a value for a target variable for which there is
insufficient forecaster participation. For example, there might be insufficient
forecasters to produce a good combination forecast for the share price of a thinly
traded stock. However, using predictions of other similar stocks in the foregoing
manner, a fairly good forecast for the share price of such a stock often will still be
possible.
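
The fitting and prediction steps just described can be illustrated with a short Python sketch. It assumes, purely for illustration, that the forecasting model is linear with an intercept and that "best fit" means ordinary least squares solved through the normal equations; the specification does not limit the model to this form, and the function names are hypothetical.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system; predictors may be collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit(past_predictions, historical_target):
    """Least-squares parameters [intercept, b1, ..., bk] relating previously
    predicted predictor values to historical values of the target variable."""
    rows = [[1.0] + list(p) for p in past_predictions]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, historical_target)) for i in range(k)]
    return solve(xtx, xty)


def predict(params, current_predictions):
    """Predict the target from the currently predicted predictor values."""
    return params[0] + sum(b * x for b, x in zip(params[1:], current_predictions))
```

Each row of `past_predictions` holds the values that were predicted for the predictor variables for one historical time point, and `historical_target` holds the target variable's realized value at that time point.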

Moreover, even when there is sufficient forecaster participation, the prediction
for the target variable produced in the foregoing manner can be compared to realized
values of the target variable and to other predictions of the target variable (such as
a combination forecast) in order to sort out the influences of different factors. This
advantage is often very helpful in assessing the impact of similar factors in the future.
For example, calculating the difference between the value of the target variable
predicted in the above manner and the actual value realized for the target variable
typically will provide a measure of information that is specific to the target variable.
Similarly, calculating the difference between the value of the target variable predicted
in the foregoing manner and the value predicted for the target variable using a
combination forecasting technique typically will provide an estimate of expected
information that is specific to the target variable.


Pricing Derivative Instruments

The present invention also provides a novel technique for pricing derivative
instruments by using forecast data.

Thus, in one aspect, the present invention is directed to pricing a derivative
instrument whose value is dependent upon the value of an underlying asset at a
future date. For each of a number of predetermined different prices, the value of a
derivative instrument is calculated if the underlying asset were to be priced at that
price on a future date. A number of individual forecasts of the value of the
underlying asset on the future date are obtained. A probability is determined for
each price, from the number of predetermined different prices of the underlying
asset, as the proportion of individual forecasts that were closer to that price than to
any other of the predetermined different prices. Finally, the derivative instrument is
priced based on the values calculated for the derivative instrument above and based
on the probabilities determined above. Preferably, the derivative instrument is priced
as the sum, over the number of predetermined different prices, of the value identified
above for the derivative instrument if the underlying asset were priced at a given
price on the future date, times the probability determined above for that given price.
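
The pricing steps above reduce to a probability-weighted sum, which the following Python sketch illustrates. The function names are hypothetical, ties between two equally close prices are assigned to the lower price as an assumption (the text does not address ties), and no discounting to present value is applied because none is described here.

```python
def price_probabilities(grid_prices, forecasts):
    """Proportion of forecasts that fall nearest to each predetermined price."""
    counts = [0] * len(grid_prices)
    for forecast in forecasts:
        nearest = min(
            range(len(grid_prices)),
            # Assumption: an exact tie goes to the lower of the two prices.
            key=lambda i: (abs(grid_prices[i] - forecast), grid_prices[i]),
        )
        counts[nearest] += 1
    return [count / len(forecasts) for count in counts]


def price_derivative(grid_prices, forecasts, payoff):
    """Sum over the predetermined prices of payoff(price) times probability."""
    probabilities = price_probabilities(grid_prices, forecasts)
    return sum(payoff(p) * q for p, q in zip(grid_prices, probabilities))


def call_payoff(strike):
    """Value at the future date of a call option with the given strike."""
    return lambda price: max(price - strike, 0.0)
```

For a call option, for instance, `price_derivative(grid, forecasts, call_payoff(strike))` weights each possible payoff by the share of forecasters whose prediction lies closest to the corresponding price.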

By virtue of the foregoing technique, a price can be determined for a
derivative instrument, often without the need to assume a particular shape of the
probability density function for the value of the underlying asset and without having
to rely on historical variances, which are often poor indicators of future variances.

The foregoing technique can also be repeated for multiple time points within
the period during which rights under the derivative instrument may be exercised.
The resulting multiple different prices can then be combined, such as by taking a
maximum of such prices, or in various other manners, to determine a final price for
the derivative instrument.

Utilization of Banner Ad Click-Through Information

The present invention provides the following novel techniques for utilizing
banner ad click-through information to predict values of variables and to manage the
display of banner ads.

In one aspect, the invention is directed to forecasting values for a variable by
obtaining click-through data (e.g., click-through rates or changes in click-through
rates) for website banner advertisements. Initially, a forecasting model is created for
a variable (e.g., using a regression technique to create a linear or non-linear
forecasting model), based on correlations of historical values of the click-through
data with historical values of the variable. Then, the forecasting model is used to
predict a future value of the variable.

In the foregoing manner, click-through data can often be used to predict a
variable. For example, it may be possible to more accurately predict new housing
starts in part based on the click-through rate for a particular mortgage advertisement.

In more particularized aspects of the invention, the website banner
advertisements may be sorted into groups by categorizing them according to
product/service advertised. Utilizing statistics for each such group may provide
continuity while at the same time lessening the effects of changing advertisements.
Thus, for example, new housing starts may be predicted based on the click-through
rates for all mortgage advertisements.
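
A minimal Python sketch of this idea follows. It assumes, for illustration only, that the group statistic is the pooled click-through rate (total clicks divided by total impressions) and that the forecasting model is a one-variable linear regression fitted by least squares; the function names are hypothetical.

```python
def category_ctr(ads):
    """Pooled click-through rate for a group of ads given (clicks, impressions)."""
    clicks = sum(c for c, _ in ads)
    impressions = sum(i for _, i in ads)
    return clicks / impressions if impressions else 0.0


def fit_line(x, y):
    """Least-squares (slope, intercept) relating historical click-through
    rates x to historical values y of the variable being forecast."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    if sxx == 0:
        raise ValueError("click-through rates must vary to fit a line")
    slope = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / sxx
    return slope, mean_y - slope * mean_x


def forecast(slope, intercept, ctr):
    """Predict the variable from a new click-through rate."""
    return intercept + slope * ctr
```

Pooling over the whole category means that the series of rates continues even when individual advertisements are introduced or retired.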

In a further aspect, the invention is directed to displaying website banner
advertisements. The displayed website banner advertisements are sorted into
categories based on product/service sold. An individual click-through rate is
determined for each website banner advertisement and an aggregate click-through
rate is determined for each category. Then, which website banner advertisements
are displayed is changed based on a comparison between information pertaining to
the individual click-through rate for a selected website banner advertisement and
information pertaining to the aggregate click-through rate for the category to which
the selected website banner advertisement belongs.

The foregoing technique often can permit the display of more effective website
banner advertisements. For example, if the click-through rate for a particular
mortgage advertisement is significantly less than the click-through rate for all
mortgage advertisements, that particular mortgage advertisement may need to be
modified or replaced.
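
The comparison described above might be implemented along the following lines in Python. The data layout, the function name and the rule of flagging an advertisement whose rate falls below a fixed fraction (`threshold`) of its category's pooled rate are assumptions made for illustration; the specification only requires some comparison between the individual and aggregate rates.

```python
def ads_to_replace(ads, threshold=0.5):
    """Return the ids of ads whose click-through rate is below `threshold`
    times the pooled rate of their category.

    `ads` maps ad id -> (category, clicks, impressions).
    """
    totals = {}
    for category, clicks, impressions in ads.values():
        c, i = totals.get(category, (0, 0))
        totals[category] = (c + clicks, i + impressions)

    flagged = []
    for ad_id, (category, clicks, impressions) in ads.items():
        category_clicks, category_impressions = totals[category]
        if impressions == 0 or category_impressions == 0:
            continue  # no display data yet, so no basis for comparison
        individual_rate = clicks / impressions
        aggregate_rate = category_clicks / category_impressions
        if individual_rate < threshold * aggregate_rate:
            flagged.append(ad_id)
    return sorted(flagged)
```

An advertisement flagged in this way would be a candidate for modification or replacement, while the remaining advertisements in the category continue to be displayed.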

Comments Regarding Summary

The foregoing summary is intended merely to provide a quick understanding
of the general nature of the present invention. A more complete understanding of
the invention can only be obtained by reference to the following detailed description
of the preferred embodiments in connection with the accompanying drawings.



BRIEF DESCRIPTION OF THE DRAWINGS 

Figure 1 illustrates the home page of a forecasting contest according to a
representative embodiment of the invention.

Figure 2 illustrates a "Community" page of a forecasting contest according to
a representative embodiment of the invention.

Figure 3 illustrates a "Library" page of a forecasting contest according to a
representative embodiment of the invention.

Figure 4 illustrates a web page providing a site map of a website for a
forecasting contest according to a representative embodiment of the invention.

Figure 5A illustrates a display for graphically entering prediction data for two
time horizons according to a representative embodiment of the invention.

Figure 5B illustrates a display for graphically entering prediction data for a
single time horizon according to a representative embodiment of the invention.

Figure 6 illustrates a display for graphically entering prediction data using a
discrete number of prediction input buttons, according to a representative
embodiment of the invention.

Figure 7 illustrates a display that includes separate graphs, arranged in a
stacked manner, for each of five different prediction variables, according to a
representative embodiment of the invention.

Figure 8 illustrates a display of a graph that includes data curves for five
different prediction variables, according to a representative embodiment of the
invention.

Figure 9 illustrates the display of a graph showing the central tendency and
dispersion data over time for predictions made by a group of forecasters.

Figure 10 illustrates a flow diagram showing process steps for implementing
a graphical input display, according to a representative embodiment of the invention.

Figure 11 illustrates a flow diagram showing steps for generating combination
forecasts using clusterization, according to a representative embodiment of the
invention.

Figure 12 illustrates a representative network environment in which the
techniques of the present invention may be implemented.

Figure 13 illustrates a representative computer system that is one of the
suitable platforms for performing computer-executable process steps to implement
the techniques of the present invention.



DESCRIPTION OF THE PREFERRED EMBODIMENTS 

In the preferred embodiment of the present invention, participants from the
general population register for and then compete in a forecasting contest.
Preferably, the contest is conducted over an electronic network, such as the Internet,
which provides immediate access to the general population. It is also preferable that
the contest is structured not as a single contest, but rather as a collection of different
competitions (or challenges) in which participants may elect to participate. As
discussed in more detail below, these challenges may be either mutually exclusive
or may overlap to some extent. Generally speaking, in the preferred embodiment of
the invention participants are ranked and/or rewarded based on their track records
over a period of time in each of the different challenges in which they participate, as
well as on how well they do in predicting values for certain individual variables (e.g.,
individual stock or commodity prices) and how well they do in different time frames
(e.g., short term, medium term, long term) both for the challenges and for the
individual variables. This flexibility in permitting participants to select which individual
variables to predict, which challenges to enter, and for which time frames predictions
will be submitted often can permit identification of the best forecasters in well
focused categories.

As described in detail below, this contest structure also encourages
participants to make the most accurate predictions possible, resulting in a highly
valuable database of forecasts. These data can then be processed in a number of
different ways to produce useful forecast information.

In order to facilitate predictions, participants preferably are provided with a
variety of resources, such as Soapboxes, Archives, a "dumpster" and chat rooms,
all as described in more detail below. The invention includes novel
community-selection aspects which attempt to ensure that the most relevant resources are made
available. The invention also includes novel features for facilitating the entering of
prediction data and for processing the prediction data to obtain more comprehensive
combination forecasting information that is less sensitive to variations in individual
participation. Finally, the invention also provides a number of novel techniques for
utilizing banner ad click-through information. Thus, the invention includes a number
of inventive features, and those features may be implemented individually or in any
of a number of different combinations. These various features are discussed in
detail below.

The Forecasting Contest 

The forecasting contest according to the present invention preferably is
conducted over an electronic network. More preferably, the contest is conducted
over the Internet. However, other electronic networks might be used instead of or
in combination with the Internet. For example, participants might be permitted to
enter predictions either via the Internet or via an ordinary touch tone telephone, using
a telephone voice response system. Similarly, participants might enter predictions
and access the other available information via an intranet and/or other local area or
wide area networks.

Figures 1 to 4 illustrate how a website implementing such a contest might be
structured according to a representative embodiment of the invention. Specifically,
Figure 1 illustrates a representative website homepage 2 for the contest. At the top
of homepage 2 are a number of links, such as links 3a to 3e, to other pages of the
website. Existing participants can log into their accounts by typing their usernames
into text field 4 and then clicking username button 5; optionally, the accounts may
be password protected so that login would require entering both a username and a
password. New participants can register for the contest (as described in detail
below) by clicking on the register button 6, which would pull up a registration
webpage on which the user would enter required and optional registration
information, and indicate the desired subscription level. As shown in Figure 1,
homepage 2 also includes a link 7 to a site tour, the feature story of the day, and a
banner advertisement 8, which typically will function as a hyperlink to the advertiser.

Clicking on link 3c pulls up the Community page 9 of the website, which is
shown in Figure 2. This page of the site includes information primarily about the
interactive informational content of the website. For example, portion 10 of the page
includes links to the top 10 rated Soapboxes (as described below). In addition,
clicking on link 11 pulls up a web page listing all of the Soapboxes with a brief
description of each. Clicking on link 12 pulls up a web page listing available
interactive games related to the subject matter of the contest. Clicking on link 13
pulls up a page describing and linking to educational classes and educational
materials related to the subject matter of the contest that are available. A different
banner ad 14 is displayed at the top of Community page 9.

Figure 3 illustrates the Library page of the contest website. This page of the
site includes information primarily about the non-interactive informational content of
the website. Thus, included are links to: written materials on the basics of
forecasting 21, historical financial and economic data 22, archives of materials
sponsored by the Soapbox Proprietors 23, archives of articles 24, a list of
recommended books 25 related to the subject matter of the contest, dumpster
materials 26 (as described below), and press releases 27 related to the subject
matter of the contest. Although the foregoing material itself is largely non-interactive,
upon linking to the pages concerning such material, participants preferably have the
ability to perform certain interactive functions, such as: searching for specific
materials according to a variety of different criteria; keyword searching; and
organizing and displaying financial and economic data in a variety of different
formats (e.g., various geographical and/or tabular formats). Certain of these features
are described in more detail below.

Finally, Figure 4 illustrates the site map page 30 of the contest website.
Specifically, this page illustrates a high-level (e.g., first and second levels only) site
plan for the contest website. The first level links, such as links 32, are the same links
that are displayed at the top of the homepage 2. The second level links, such as
links 34, are to the primary links included in the first level pages. The site plan could
also show deeper levels of the website, but two levels is believed to be sufficient to
give the user an overview of the site without providing too many details, which might
be confusing to the participant.

The Tournament page of the website, which can be reached from link 3b or
from link 35, for example, allows the participant to submit prediction values, view
historical data, view their own previous prediction values, or view other participants'
prediction data, all as described in more detail below.

In the preferred embodiment of the invention, the contest is open to the
general public. As used herein, the term "general public" does not preclude certain
relatively minor limitations, such as excluding: individuals under 18 years of age,
individuals who cannot provide valid identification (such as a credit card number or
e-mail address), or individuals or entities who cannot or will not pay to enter the
contest. However, subject to such relatively minor limitations, the term "general
public" is intended to encompass a wide segment of the population. By opening the
contest to the general public, the present invention can collect a qualitatively, as well
as quantitatively, different set of data than is the case with many conventional
forecasting contests which limit participants to only a small group of "experts" in the
field, such as conventional contests which limit participation only to large stock
brokerages.

However, it should be understood that the contest is not necessarily limited
only to members of the general population. Rather, contests according to the
invention may also be conducted for smaller and/or more focused groups of
participants. In fact, in certain cases it may be preferable to limit participation in a
particular contest only to members of a certain group, firm, club or trade association.

It is also preferable that the actual participants in the contest are self-selected,
rather than individually invited to participate. Thus, in the preferred embodiment of
the invention, an individual or entity (hereafter, "person") that wishes to participate
in the contest merely logs onto the contest website and registers. As indicated
above, as part of the registration process the person might be required to provide
certain minimal qualification information and/or may be required to pay a fee to
participate (such as by providing credit card information over a secure connection).
Upon verification of such qualification information, the person is then eligible to
participate.

Registration to participate in the contest preferably also requires the potential
participant to provide certain information regarding personal characteristics of the
potential participant, such as: occupation, age, place of residence, income, highest
level of education obtained, schools attended, avocational interests, the dollar value
of the potential participant's personal investment portfolio, the dollar value of the
investment portfolio managed by the potential participant on behalf of third parties,
trading frequency, other information relating to trading behavior, and/or various other
demographic or personal information. In addition, some portion of the foregoing
information may be required as a condition to registration while other information
may be optionally provided by the potential participant. Potential participants may
also be encouraged to provide the optional information by providing economic
incentives. Such incentives may take the form of cash, merchandise, cash credits
(hereafter, "cBucks") which can only be used to purchase services, information or
merchandise from the entity conducting the contest or from other entities that are
pre-approved by the entity conducting the contest, or anything else of value.

Although it is contemplated that both individuals and entities may be permitted
to participate in the contest, it might also be preferable to limit participation only to
individuals, in order to be able to identify the true source of each prediction and to
ensure that each source remains the same over time. Thus, for example, the track
record of a manager for a certain mutual fund could follow him even if he moved to
a different fund. This may be more desirable than allowing a prediction from the
mutual fund as an entity, in which case the actual individual providing the predictions
may vary over time.

Preferably, the contest allows participants to select and predict a number
(more preferably, any number) of variables from among a set of predesignated
variables. In the preferred embodiment of the invention, these predesignated
variables have values that vary over time so that the values of those variables at a
number of different points in time can be predicted. More preferably, the
predesignated variables pertain to various financial and/or economic quantities, such
as the price of a particular stock, the Dow Jones Industrial Average (DJIA), a
commodity's price, the unemployment rate, the Consumer Price Index, Gross
Domestic Product, the trade surplus/deficit, a particular interest rate benchmark, or
a currency exchange rate.

In the preferred embodiment, the contest also is tailored to specific groups of
participants by allowing participants to participate in more focused games within the
overall contest. These focused games are referred to herein as "challenges", and
may be available to all participants, or some or all of the challenges may only be
available to those having a minimum subscription level (e.g., only paying
participants). For example, the contest might include one or more of the following
challenges, with the predesignated prediction variables for each challenge indicated.

Stock Market Challenge
    Dow Jones Industrial Average
    Standard and Poor's 500 Index
    NASDAQ Index
    Wilshire 5000 Index
    Share price of Magellan Fund

Macroeconomic Challenge
    Percentage Increase in Gross National Product
    Percentage Increase in Consumer Price Index (CPI-U)
    M3 money supply
    Unemployment Rate
    New Housing Starts

Treasury Yield Curve Challenge
    3-month treasury bill rate
    One-year treasury bill rate
    Five-year treasury note rate
    Ten-year treasury note rate
    Thirty-year treasury bond rate

International Challenge
    EAFE Index (or Dow Jones World Index)
    Dollar/Yen exchange rate
    Dollar/Euro exchange rate
    LIBOR Eurodollar rate
    Nikkei 225 (or Pacific Region Index (excluding Japan))

Commodity Challenge
    Gold price
    Sweet Light Crude Oil price
    Spring Wheat price
    Corn price
    Coffee price

Option Challenge (note: the five dates are within the next six months)
    Yahoo 150 Jan Call (and each week a different stock option)
    CBOE Dow Jones Industrial Average
    Pacific (PSE) Technology
    CBOE S&P 500 Index
    CBOE Nikkei

Long-term Challenge (this Challenge preferably is run monthly for forecasts: six
months from now, year-end from now, two year-ends from now, three year-ends
from now, and five year-ends from now)
    Dow Jones Industrial Average
    NASDAQ
    Ten-year treasury note rate
    Sweet Light Crude Oil price
    EAFE Index (or Dow Jones World Index)

Open Challenge (the five measures will be selected from the other Challenges)
    Dow Jones Industrial Average
    Gold price
    Nikkei 225 (or Pacific Region Index (excluding Japan))
    Ten-year treasury note rate
    Yahoo 150 Jan Call (and each week a different stock option)

Within each challenge, a participant preferably may predict any number of the
variables indicated. However, as will become apparent below, in order to be highly
ranked within a particular challenge it may be necessary to predict as many of the
variables within the challenge as possible. On the other hand, as the rules of the
contest preferably also contemplate ranking many or all of the variables individually, a
participant might only care about his rank with respect to individual variables, but not
about his rank within any challenge. Thus, for example, a participant might not care
about his rank in the Stock Market Challenge, but might care very much about his
rank as a predictor of the DJIA, and therefore would only predict that variable. In the
preferred embodiment, participants may participate in as many challenges as they
desire and may predict as many individual variables as they desire.

Also, it is preferable that each participant be given the opportunity to predict
at least some of the variables at a number of different time horizons. For example,
participants in the Stock Market Challenge might have the option of predicting the
variables included in that challenge for their closing value at the end of next week,
4 weeks from the end of next week, 13 weeks from the end of next week, 52 weeks
from the end of next week, year-end, and/or end of next year. Preferably,
participants may predict, for each variable, values for as many of the available time
frames as they desire.

Also in the preferred embodiment of the invention, participants may enter and
revise their predictions as frequently as they like. In fact, providing new predictions
and revising those predictions as early as possible are encouraged with incentives.
This differs from many conventional contests (such as the contests at
www.eas.purdue.edu/forecast and www.predictit.com) and provides the advantage
that the prediction database resulting from the contest becomes more heavily
populated and tends to include predictions that are updated or newly submitted more
continuously, rather than mainly at discrete points in time. The resulting
database can often be more useful for combination forecasts, as well as for other
purposes of statistical analysis.

However, at certain time points the predictions become locked and no further
changes can be made for the current prediction event. Thus, for example, consider
the case in which participants are asked to predict each day what the value of a
financial variable, such as the DJIA, will be at the end of next week. In this case, a
different prediction event occurs each day for that variable. Assume further that the
contest is structured such that the closing time point for each such prediction event
is 6:00 p.m. Los Angeles time. In this example, participants would be able to predict
the value of the variable and then adjust their predictions throughout the day, but at
6:00 p.m. Los Angeles time, all of the predictions become locked. Thereafter, any
new predictions or changes in predictions will not be given effect for the current day's
prediction event, but instead will only be given effect for the prediction events ending
at 6:00 p.m. Los Angeles time for subsequent days. All of the locked-in predictions
for the current day's prediction event will then be compared upon realization of the
variable's true value as of the end of the applicable time horizon (e.g., the end of
next week). The foregoing rules are then applied to each day's prediction event.
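
The locking rule of the foregoing example can be sketched in a few lines of Python.
This is an illustrative sketch only, not a description of the actual contest software:
the function name prediction_event_date is hypothetical, and submission times are
assumed to be naive datetimes already expressed in Los Angeles local time.

```python
from datetime import date, datetime, time, timedelta

# Assumed daily closing time point (6:00 p.m. Los Angeles time, per the example).
CLOSING_TIME = time(18, 0)


def prediction_event_date(submitted_at: datetime) -> date:
    """Return the date of the prediction event to which a submission applies.

    `submitted_at` is assumed to be expressed in Los Angeles local time.
    A submission made before the closing time point counts toward the
    current day's prediction event; a submission made at or after it is
    given effect only for the next day's prediction event.
    """
    if submitted_at.time() < CLOSING_TIME:
        return submitted_at.date()
    return submitted_at.date() + timedelta(days=1)
```

Under these assumptions, a prediction entered at 5:59 p.m. is locked into that day's
prediction event, while one entered at 6:00 p.m. or later first takes effect for the
following day's event.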

In the foregoing example, only one variable and one time frame were
considered. It is more preferable that participants be given the opportunity to predict
many different variables and for multiple time frames. In this regard, the closing time
point for each variable might occur each day at exactly the same time. However, it
should be noted that closing time points for each variable might instead be assigned
either arbitrarily, in a manner so as to optimize the frequency or quality of prediction
data, based on empirical results, or in any other manner. In particular, it is noted that
using a fixed closing time point for all variables might be simpler from the
participants' point of view, but might create network traffic problems just before the
common closing time point. Also, it might be determined, for example, that for
certain variables it is best to set closing time points every other day or every week,
rather than every day. Still further, it might be best to adjust closing time points so
as to occur some minimum amount of time after the applicable markets close or to
schedule the closing time points based on expected public announcements.
It is noted that where closing time points occur periodically (such as each
day), the realization time can either be fixed (e.g., the end of next week will be the
same for seven consecutive closing time points) or rolling (e.g., one month from
today will be different for each closing time point). In the former case, participants
generally will be predicting what the value will be at the same realization time. In the
latter case, each participant will effectively select his own realization time, which will
be determined based on the date and time that his prediction is made. This latter
case may also be extended further by allowing each participant to set his own
realization time point for each prediction made; for example, a participant might, in
addition to submitting a prediction, also specify when he expects that prediction to
be valid (e.g., 3:00 p.m. on next Thursday). Also, in either case the contest might
instead be conducted without closing time points at all, but rather so as to permit
each participant to decide for himself the time point at which his prediction will be
deemed effective; generally, this time point most likely would be when the prediction
is actually submitted.

In the preferred embodiment of the invention, predictions are held over from
one prediction event to another until updated by the participant. Thus, in the
example given above, a prediction made on Monday morning, if not otherwise
adjusted during the day, would be used for the closing time point on Monday. If still
not adjusted on Tuesday, the same prediction would be used for the closing time on
Tuesday, and so on.
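
The hold-over behavior can be illustrated with the following Python sketch. It is
offered under stated assumptions only: the function name effective_predictions and
the representation of prediction events as day labels are hypothetical and are not
taken from the contest software.

```python
from typing import Dict, List, Optional


def effective_predictions(
    updates: Dict[str, float], event_days: List[str]
) -> Dict[str, Optional[float]]:
    """Carry each prediction forward until the participant updates it.

    `updates` maps an event day to the value the participant last entered
    before that day's closing time point. Days without an entry reuse the
    most recent earlier value; days before the first entry map to None,
    meaning the participant did not take part in those prediction events.
    """
    effective: Dict[str, Optional[float]] = {}
    current: Optional[float] = None
    for day in event_days:  # event_days must be in chronological order
        if day in updates:
            current = updates[day]
        effective[day] = current
    return effective
```

For instance, a participant who enters a value on Monday and revises it on
Wednesday would, under this sketch, have the Monday value used for both the Monday
and Tuesday closing time points.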

In addition to individual participation, participants preferably are divided into
groups based on the participants' interests, occupation or other personal
characteristic information provided pursuant to the registration process. For ease of
discussion these groups are referred to herein as "Universes". Accordingly,
participants may be ranked only against other members of their Universe, only
against all other participants, or may be ranked within their Universe as well as
overall. Examples of Universes might include Stock Brokers, Soccer Moms,
Students, College Professors, Wall Street Analysts, Journalists, and Government
Economists. It may also be preferable to assign participants to sub-groups (which
may be referred to as "teams") within each Universe or across Universes. Such
team assignments may be made randomly, on a first-come-first-served basis (e.g.,
the first 50 registrants in the Universe are assigned to Team 1, the next 50 to Team
2, etc.), by self-selection among the participants, or on any other basis. Each
participant participating in a Universe preferably also is asked for information and
permission to notify the appropriate local news media if the participant is identified
as one of the top forecasters in that Universe or other grouping.
Participants may also be given the opportunity to join "clubs". If the clubs are
constrained to include only members of the same Universe, then the clubs are types
of teams. However, this constraint is not essential. Each club may have its own chat
room and/or other venues for interacting. Clubs may then be ranked against other
clubs and/or rewarded based on their performances. Similarly, individual club
participants may be rewarded based on the performance of their clubs.

In addition to predicting actual values for certain predesignated variables,
participants may also be asked to provide indicators concerning values for certain
variables. For instance, one question might be whether the DJIA will be up or down
(an up/down indicator) when comparing tomorrow's close to today's close (or to the
value as of the time the prediction is entered). Furthermore, the usual contest
predictions might be supplemented by providing various survey questions throughout
the day.

One embodiment which utilizes such additional survey questions is as follows.
Participants submitting predictions are given chances to participate in a Special
Challenge, where the number of chances is related to the number of predictions
submitted and/or to the number of prediction updates submitted. Then, participants
are randomly selected to participate in the Special Challenge, with the probability of
any given participant being selected being equal to (the number of chances held by
the participant)*(the total number of participants to be selected for the Special
Challenge)/(the total number of outstanding chances). The highest ranking
participants in the Special Challenge are then rewarded. This embodiment provides
additional incentives for participants to provide and update their predictions as early
as possible and also provides the entity conducting the contest with the opportunity
to elicit different information over time. Such flexibility can permit the contest
promoters to test-market questions for permanent use, to obtain highly focused
and/or time-specific information, and/or to gather valuable marketing data.
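
As a worked illustration of the selection probability stated above, the following
Python sketch evaluates the expression for each participant. The function name
selection_probabilities and the sample chance counts are hypothetical and are used
only for illustration.

```python
from typing import Dict


def selection_probabilities(
    chances: Dict[str, int], num_selected: int
) -> Dict[str, float]:
    """Evaluate (chances held) * (number selected) / (total chances).

    `chances` maps each participant to the number of chances he holds and
    `num_selected` is the total number of participants to be selected for
    the Special Challenge.
    """
    total_chances = sum(chances.values())
    if total_chances <= 0:
        raise ValueError("at least one chance must be outstanding")
    return {
        name: held * num_selected / total_chances
        for name, held in chances.items()
    }


# Hypothetical example: 100 outstanding chances, 10 participants to be selected.
example = selection_probabilities({"A": 5, "B": 2, "others": 93}, num_selected=10)
```

In this hypothetical case a participant holding 5 of 100 outstanding chances would
have a selection probability of 0.5. It may be noted that the expression as stated
can exceed 1 for a participant holding a very large share of the chances; how such a
case would be treated is not addressed here.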

Other techniques may also be used to elicit responses to additional survey
questions, such as providing either fixed or random rewards to participants who
answer the questions. This latter technique might be more appropriate in cases
where the answers are incapable of being judged as to accuracy, such as where the
questions are attempting to elicit personal preferences. In any case, the data
obtained from such additional survey questions can be quite valuable from a
marketing standpoint, particularly when used in conjunction with the personal
characteristic information provided by the participants.

It is contemplated that, in the preferred embodiment of the invention, various
levels of participation will be available to participants. For instance, persons who log
onto the website (or other network node) might only be permitted to browse the site
for the purpose of determining what services are available and how the contest is
played. However, in order to submit predictions a person would need to register.
Upon registration various subscription levels would be available. In order to obtain
higher subscription levels it may be necessary to pay higher fees and/or to qualify
in some other way. For example, Basic Service might be available at no charge to
all who register (including providing the personal characteristic information described
above). Basic Service might entitle the participant to participate in the Open
Challenge, use the library and Archives, access the Soapbox of the Week, and
access all costless features (e.g., 15-minute delayed quotes). Many of the foregoing
features are described in more detail below. An Advanced Service, which includes
everything but the Premium Sites (see discussion below) and which might also
include certain proprietary metrics relevant to the available sites, might be available
at some charge. At a higher charge, a participant might select Premium Service,
which includes the Advanced Service features, a number of Premium Sites and some
proprietary metrics relevant to those Premium Sites. At a still higher charge, a
participant might elect Institutional Service, which would include all sites plus some
additional proprietary metrics, including an online form which allows the participant
to enter third party advisors' forecasts and compare them to various benchmarks
(generated from the contest data) for accuracy, bias, and efficiency evaluation (the
"Yardstick"). The Yardstick can thus function as an element of due diligence
evaluation when selecting and evaluating performance of fund managers, portfolio
advisors, and staff economists.

As noted above, participants in the contest are ranked and/or rewarded based
on their performance. There may be separate rankings for each of a number of
different variables, for each challenge, and for different time frames with respect to
a single variable or a single challenge. Thus, for example, there might be rankings
for the best overall predictions in the Stock Market Challenge, best long-term
predictions in the Stock Market Challenge (where long-term might be defined, for
example, as predictions of one year or greater), and best short-term prediction for
Microsoft stock (where short-term might be defined, for example, as predictions of
less than two weeks). Any other categories may also or instead be selected for
ranking, with the actual ranked categories preferably being determined based on the
interest of the participants or the interest of the population as a whole, bearing in
mind that an important function of the rankings is to inform as to the relative merits
of the various participants. The highest ranking participants in each category may
be rewarded with cash, cBucks, merchandise, services, additional investment
information, or anything else of value. Alternatively, the chance to be highly ranked,
as well as the corresponding publicity, alone might provide sufficient incentives to
attract participants.

Within each category, there are a number of different ways in which to rank
the various participants. Preferably, ranking is based on a combination of the
relative accuracy (e.g., percentile rankings) of a participant for each prediction event
in which he participated. Thus, as a simple example, assume that a ranking is being
conducted for the best predictor of the DJIA for the "end of next week" over a
particular three-month period of time. Also assume that there are 7 opportunities per
week (i.e., one closing time point per day) to predict the value of the DJIA at the end
of next week. Assuming further that there are exactly 13 weeks in the subject
three-month period of time, then there will be 7*13 = 91 prediction events in the category.
However, not all participants will provide predictions for each prediction event. Some
participants might not register until after the three-month period has begun. Still
others might elect not to submit predictions for one or more days during the
three-month period.

Accordingly, in the preferred embodiment, the participants are given a
percentile ranking for each prediction event in which they participate. For purposes
of consistency in speaking of percentile rankings, as used herein an x percentile
ranking will be understood to mean the top x% of the forecasters; thus, the 1st
percentile will mean the top 1%. In one embodiment, percentile rankings are
assigned based on the absolute values of the differences between the predicted
value and the realized value.

Ties can be handled in a number of ways, such as assigning all tying
predictions to the percentile midpoint that the tying group occupies; for example, if
a group of forecasters predicted the same value and that group would have occupied
from the 30th to the 40th percentile, everyone in the group could be assigned to the
35th percentile. Alternatively, ties might be broken by ranking earlier unchanged
predictions higher than later unchanged predictions; thus, if the closing time point
were 6:00 p.m. and two tying predictions were last updated at 4:00 p.m. and at 5:00
p.m., respectively, the 4:00 p.m. prediction would be ranked higher than the 5:00
p.m. prediction.
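
One way the percentile assignment with midpoint tie handling might be carried out
is sketched below in Python. The function name percentile_rankings is hypothetical,
and the convention that a participant ranked r out of n falls at the (100*r/n)th
percentile is an assumption adopted for the sketch, chosen to agree with the "top
x%" convention and the 30th-to-40th percentile example above.

```python
from typing import Dict


def percentile_rankings(
    predictions: Dict[str, float], realized: float
) -> Dict[str, float]:
    """Assign percentile rankings (lower is better) from absolute errors.

    A participant at rank r out of n is placed at the (100 * r / n)th
    percentile. Tied participants all receive the midpoint of the
    percentiles their group would otherwise have occupied.
    """
    errors = {name: abs(value - realized) for name, value in predictions.items()}
    ordered = sorted(errors, key=errors.get)  # smallest absolute error first
    n = len(ordered)
    rankings: Dict[str, float] = {}
    i = 0
    while i < n:
        # Find the end of the group of participants tied at this error.
        j = i
        while j + 1 < n and errors[ordered[j + 1]] == errors[ordered[i]]:
            j += 1
        first_rank, last_rank = i + 1, j + 1
        midpoint = 100.0 * (first_rank + last_rank) / (2 * n)
        for name in ordered[i : j + 1]:
            rankings[name] = midpoint
        i = j + 1
    return rankings
```

With 100 participants and a tied group occupying ranks 30 through 40, every member
of the group would be assigned the 35th percentile under this sketch. The
alternative of breaking ties by the time of the last update is not shown.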

In this regard, it is noted that the time of the last prediction update might be
factored into ranking in other ways besides tie breaking; for example, for each
participant the absolute value of the difference between the participant's predicted
value and the realized value might be multiplied by a factor (the "time factor") that is
based on the time of the last prediction update. All of such techniques will tend to
encourage prediction updates as soon as new information is available to the
participants, thereby increasing the size and continuity of the database available for
combination forecasting.

In the preferred embodiment of the invention, the percentile rankings for each
participant are combined into a raw score that is compared against the raw scores
of the other participants, and then the participants are ranked based on their raw
scores. It is also preferable that participants are rewarded for consistency. For
example, someone who is consistently in the 20th percentile might rank higher than
another person whose median or average is the 15th percentile but whose various
individual percentile rankings exhibit greater variation. Finally, it is also preferable
to reward participants who have predicted more of the available prediction events
higher than those who have predicted fewer. In addition, a participant may be
required to participate in a minimum number of required prediction events in order
to be ranked. In view of the foregoing considerations, the following formula is one
example of a ranking formula for use in the forecasting contest according to the
preferred embodiment of the invention.



$$\text{RawScore} = \operatorname{median}(\text{percentiles}) \cdot (1 + \sigma) \cdot \left(\frac{PE_T}{PE_p}\right)^{x}$$

where median(percentiles) is the median of all percentile rankings for prediction
events in which the participant participated for the subject category, $\sigma$ is the standard
deviation (or any other dispersion measure) of those percentile rankings, $PE_p$ is the
number of prediction events in which the participant participated, $PE_T$ is the total
number of prediction events in the subject category, and $x$ is a real number, typically
greater than or equal to 0, which specifies the extent to which participants are
penalized for failing to participate in the maximum number of prediction events
possible, with 0 reflecting no penalty and higher values of $x$ reflecting higher
penalties. Using the above formula, a raw score can be calculated for each
participant in the category, and then the participants with the lowest raw scores are
ranked the highest.
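
By way of illustration only, a raw score of this kind might be computed as in the
following Python sketch. The sketch assumes the raw score is the product of the
median percentile ranking, one plus the dispersion, and the ratio of total to
participated prediction events raised to the power x, which is consistent with the
definitions above. The function name raw_score and the choice of the population
standard deviation as the dispersion measure are assumptions.

```python
from statistics import median, pstdev
from typing import Sequence


def raw_score(
    percentiles: Sequence[float], total_events: int, x: float = 1.0
) -> float:
    """Combine a participant's percentile rankings into a raw score.

    Lower raw scores rank higher. `percentiles` holds one percentile ranking
    per prediction event in which the participant took part, `total_events`
    is the number of prediction events in the category, and `x` controls the
    penalty for missed events (0 means no penalty).
    """
    participated = len(percentiles)
    if participated == 0:
        raise ValueError("participant has no prediction events to score")
    sigma = pstdev(percentiles)  # assumed dispersion measure
    participation_penalty = (total_events / participated) ** x
    return median(percentiles) * (1 + sigma) * participation_penalty


# Hypothetical comparison: steady 20th percentile vs. a variable 10th/20th mix.
steady = raw_score([20.0, 20.0, 20.0], total_events=3, x=0.0)
variable = raw_score([10.0, 20.0], total_events=2, x=0.0)
```

In this hypothetical comparison the consistently 20th-percentile participant
receives the lower (better) raw score even though the other participant has the
better median, which reflects the stated preference for consistency.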

It should be understood that the above formula is exemplary only, and any
other formula for combining percentile rankings (or other measures of relative
accuracy), preferably that also incorporates the above-stated considerations, may
be used instead. In addition, it is also possible to provide an overall ranking within
a category by combining data that is indicative of the participant's absolute accuracy,
rather than relative accuracy. This may be particularly desirable in cases where
relative accuracy is difficult to obtain, such as in the embodiments described above
where fixed closing time points are not utilized, but instead each participant's
prediction is deemed effective when submitted. In the event that absolute accuracy
is utilized, it is still desirable that the raw score formula incorporate the other
considerations (e.g., emphasis on consistency, reward for increased participation
and for predicting earlier) stated above.

However, one advantage of using relative accuracy such as percentile
rankings in order to determine an overall ranking is that such relative accuracies
facilitate comparison of participants who are predicting different variables. For
example, one challenge might allow each participant to individually select a group
of stocks whose prices the participant will predict. Although it may be unlikely that
any two participants will select exactly the same stocks, each participant can
nevertheless have a percentile ranking for each prediction event. The various
percentile rankings can then be combined in the same manner as if all participants
were predicting for the same stocks.

The formulas for producing raw scores may also incorporate other
considerations. For instance, as described above, the contest permits participants
to estimate certain variables in a number of different prediction events. When
ultimately combined to produce a raw score, how well a participant did in one
prediction event is weighted the same as how well he did in any other prediction
event. However, it is also possible to weight the prediction events differently. For
example, in a category where the value of the DJIA is predicted for the "end of next
week", the Saturday prediction (which is 13 days away from the realization time) may
be weighted more heavily than the Friday estimate (which is only 7 days from the
realization time). Similarly, prediction events may be weighted differently depending
upon how many participants participated in each prediction event.
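
One possible way to weight prediction events differently is to replace the ordinary
median of the percentile rankings with a weighted median, as in the following Python
sketch. This particular technique is an assumption and is not specified above; the
function name weighted_median and the use of days-to-realization as weights are
likewise hypothetical.

```python
from typing import Sequence


def weighted_median(values: Sequence[float], weights: Sequence[float]) -> float:
    """Return the lower weighted median of `values`.

    Each value (e.g., a percentile ranking for one prediction event) counts
    in proportion to its weight (e.g., the number of days between the
    prediction event and the realization time).
    """
    if len(values) != len(weights) or len(values) == 0:
        raise ValueError("values and weights must be non-empty and equal in length")
    pairs = sorted(zip(values, weights))
    half = sum(weights) / 2.0
    cumulative = 0.0
    for value, weight in pairs:
        cumulative += weight
        if cumulative >= half:
            return value
    return pairs[-1][0]  # reached only if all weights are zero
```

For example, if a 30th-percentile result carries a weight of 13 and a
10th-percentile result carries a weight of 7, this sketch returns 30, so the
longer-horizon prediction event dominates.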

Still further, the contest might be structured so as to permit participants to
submit, in addition to a prediction value for each prediction event, an estimate of their
own uncertainty regarding their prediction. In this case, prediction events for which
the participant indicated a high degree of uncertainty might be weighted lower than
prediction events for which the participant indicated a lower degree of uncertainty.
In such cases, the number of prediction events for which the participant is deemed
to have participated (e.g., $PE_p$) preferably would be adjusted accordingly. For
example, a prediction event for which the participant indicated a low degree of
uncertainty might count as 1, while a prediction event for which the participant
rankings for individual prediction events and/or for overall rankings. The advantages
of these features are described above. 



Prediction Input 

In the preferred embodiment of the invention, participants have the option of
inputting their forecast data either numerically or in a graphical format. Preferably,
the user interface that provides such capabilities is implemented in a Java applet
which is downloaded into the participant's computer when the participant is logged
onto the contest website, as described in more detail below. However, the software
for implementing these capabilities can also be embodied in a separate software
package and stored on a computer readable medium, such as a CD-ROM. The
software for implementing these features is referred to herein as the "Workbench".

Numerical input can be accomplished by having the participant type a specific
numerical value into a designated field. For instance, assume that the participant is
predicting what the value of a particular stock will be at the end of next week and at
the end of 13 weeks, and believes that those values will be 180 and 200,
respectively. In this case, the participant clicks on the "end of week" field for the
stock, types in "180", clicks on the "end of 13 weeks" field, types in "200", and then
(possibly after entering additional prediction and/or other data) clicks on the "submit"
or similar button. This numerical technique of entering prediction data is very similar
to what is commonly done in conventional techniques.

However, in the preferred embodiment of the invention, participants may
instead opt to enter their predictions in graphical format using the Workbench.
Preferably, when a participant elects to submit data in graphical format, the
participant is provided with a graph illustrating historical values for the particular
variable under consideration and also indicating at least one time frame at which the
variable can be predicted. One example of such a graph is shown in Figure 5A.

Specifically, Figure 5A illustrates a graph 50 for predicting the value of a
particular stock, in which the vertical axis 51 represents the price of the stock and the
horizontal axis 52 represents time. The left side of the graph 50 illustrates historical
values of the stock, preferably up until the current moment. The right side of the
graph 50 includes bands for predicting future values of the stock, such as a band 54
for predicting what the value of the stock will be at the end of next week and a band
55 for predicting what the value of the stock will be at the end of 13 weeks. Although
graph 50 includes only 2 bands, the graph may instead include bands for all time
frames available for prediction (e.g., 5), or any lesser number of time frames.
It is noted that the amount of historical data presented may be varied. In the
example shown in Figure 5A, the initial time frame of interest is the "end of next
week". Accordingly, the graph 50 is constructed to show daily fluctuations over a
period of approximately five weeks. A different interval of time for presenting
historical data may instead be presented, although lengthening the interval too much
will tend to obscure shorter term fluctuations and, in the extreme, may make it
difficult to discern fluctuations within the time frame of interest. On the other hand,
shortening the interval too much might not provide the participant with enough
historical data on which to make a well-informed prediction. Thus, the preferred time
interval for presenting historical data is from 1 to 20 times the time frame of interest
and, more preferably, 3 to 10 times the time frame of interest. For example, for "end
of next week" predictions, historical data might be presented for the past 3 to 10
weeks.
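
The relationship between the forecasting time frame and the length of the displayed
history can be expressed as a simple computation, sketched below in Python. The
function name default_history_window and the default multiple of 5 are assumptions
made for illustration; only the 1-to-20 range (and the preferred 3-to-10 range) comes
from the discussion above.

```python
from datetime import timedelta


def default_history_window(
    time_frame: timedelta, multiple: float = 5.0
) -> timedelta:
    """Return the initial span of historical data to display.

    The span is a multiple of the forecasting time frame. Multiples outside
    the 1-to-20 range described above are rejected; the default of 5 is an
    assumed value within the preferred 3-to-10 range.
    """
    if not 1.0 <= multiple <= 20.0:
        raise ValueError("multiple must be between 1 and 20")
    return time_frame * multiple
```

Under these assumptions, a one-week time frame yields an initial history of five
weeks, which is in line with the approximately five-week history shown in Figure 5A.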

Based on the foregoing considerations, at least the initial length of the
historical time interval preferably differs depending upon the forecasting time frame.
Once that initial interval has been provided to the participant, however, the
participant preferably also is provided with the option of expanding the interval (i.e.,
so that a longer interval of historical data is displayed in the same space on the
screen), shortening the interval (i.e., so that a shorter interval is displayed in the
same space on the screen), or zooming in on a particular segment of the interval
(i.e., so that the selected segment is displayed in a larger portion of the screen), in
any combinations selected by the participant.

Similarly, the range and scale of the vertical axis 51 preferably also may be
adjusted as desired. In the present example, it is believed that a band around the
fluctuations during the historical time interval displayed is most appropriate.
However, any other default range may instead be used. Once again, it is preferable
that a default range and scale are provided and then the participant is given the
option of altering the range of values displayed, as desired. In this way, the
participant is given maximum flexibility to configure the display according to her
needs.

In order to enter a prediction, the participant simply moves her cursor to the
appropriate band and clicks on the point where she believes the value will be at that
time. Thus, if the participant wants to predict what the stock's value will be at the
end of next week, she simply moves her cursor to band 54. In the preferred
embodiment of the invention, when the participant moves the cursor into a prediction
band the value on which the cursor is resting is automatically displayed. Thus, for
example, when cursor 56 is moved into band 54, a value indicator 57 is automatically
displayed. In the particular example shown in Figure 5A, the cursor position
corresponds to a value of "185". Therefore, the value indicator 57 displays "185".
Moving cursor 56 up or down in band 54 causes value indicator 57 to display
different values reflecting the cursor's vertical position.
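
The conversion from the cursor's vertical position to a displayed value can be
sketched as a linear mapping, as in the following Python example. The function name
cursor_to_value, the pixel coordinates, and the axis range of 160 to 210 are all
hypothetical; a linear vertical axis with screen coordinates increasing downward is
assumed.

```python
def cursor_to_value(
    cursor_y: float,
    axis_top_y: float,
    axis_bottom_y: float,
    value_min: float,
    value_max: float,
) -> float:
    """Map a vertical cursor position (in pixels) to a value on the axis.

    Screen y-coordinates are assumed to grow downward, so `axis_top_y`
    corresponds to `value_max` and `axis_bottom_y` to `value_min`. Positions
    outside the axis are clamped to its ends.
    """
    fraction = (axis_bottom_y - cursor_y) / (axis_bottom_y - axis_top_y)
    fraction = min(1.0, max(0.0, fraction))
    return value_min + fraction * (value_max - value_min)


# Hypothetical layout: axis drawn from y=50 (top) to y=450 (bottom), 160 to 210.
indicated = cursor_to_value(250, 50, 450, 160.0, 210.0)
```

With this hypothetical layout, a cursor resting midway up the axis corresponds to a
value of 185, and moving the cursor up or down changes the indicated value
proportionally.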

Designating a particular cursor position (such as by left-clicking a mouse
button) causes value indicator 57 to convert into a text box which displays the same
value that was indicated by value indicator 57. This allows the participant to change
the indicated value to a completely different value, if desired, or simply to fine tune
the prediction value with more precision than may be possible given the limited
display screen resolution. In particular, the participant can do either by simply
moving the cursor within the text box and using the computer keyboard to delete or
enter new digits. Once such changes have been made, or in the event the
participant is satisfied with the prediction indicated by the initial cursor designation,
the participant can submit the prediction, such as by clicking on a "confirm", "submit"
or similar button (not shown) on the display. Otherwise, the participant can cancel
the prediction, such as by clicking on a "cancel" or similar button (not shown) on the
display, and then moving the cursor to a different position in the band. In either
event, the participant can move the cursor to a different band in order to enter a
prediction for a different time frame.

As noted above, Figure 5A illustrates bands 54 and 55, representing two
different prediction time frames. However, the appropriate length of the historical
data time interval displayed for the two might be different. In fact, even including
band 55 (which is the end of 13 weeks) significantly shortens the amount of time that
can be displayed within a given display width, particularly if one wishes to maintain
a constant scale on the horizontal axis. This problem is even further exacerbated if
more than two different time frames are displayed on the same graph. Therefore,
if more than one time frame band is presented on the initially displayed graph, the
participant preferably is given the option of reconfiguring the graph so as to optimize
the display of historical data for each different band on the initial graph.

For example, to so reconfigure graph 50, the participant might move cursor
52 into band 55, right click with her mouse, and then select "reconfigure" or an
equivalent instruction. In response, graph 60 (shown in Figure 5B) is generated.
Because the present time frame is further out than the previous, historical data are
provided over a longer time interval in graph 60. Specifically, historical data are now
shown over a period of approximately 3 years, rather than 5 weeks. However, once
again this display preferably is only the initial default display and the user can then
custom-configure the display in other ways, such as those described above.

Predictions are then submitted in the same manner as described above in 
connection with Figure 5A, i.e., clicking in band 62 (which corresponds to band 55), 
using the text box 57 to fine tune the prediction if desired, and then clicking on the 
"submit" button. 

Alternatively, a participant may avoid using the graphical input completely by
typing a numerical prediction in a provided text box, such as text box 58 beneath
band 54 or text box 59 beneath band 55. Also, for purposes of refining or changing
a prediction entered using the graphical method described above, the numerical
value of the graphically input prediction may be displayed in text box 58 or text box 59,
as applicable, rather than in a pop-up text box 57 next to cursor 56.

It is noted that, initially, participants may be uncomfortable clicking on arbitrary
areas within a band. Accordingly, an alternate version would be to present users
with discrete "buttons" for inputting predictions. Specifically, displayed on the left
side of the graph would be the historical trend of recent past values up to the present
time in a manner similar to that shown in Figure 5B. Then, on the remaining
right-hand portion of the graph, for each future time horizon, several buttons would
be displayed for entering the participant's prediction. The available buttons can be
scaled to offer a variety of choices consistent with the measure being considered.
Preferably, the buttons would be arranged vertically from the highest value (or
change of value) to the lowest value (or change of value) on the screen and would
correspond to the time frame shown and indicated on the time axis. Participants
preferably still would have the option of providing an exact numerical prediction
instead of selecting a button for each prediction. When the predictions for each time
frame for each variable have been entered, the participant would click to submit
those predictions.

Figure 6 illustrates one example of the foregoing embodiment. Shown in
Figure 6 is a graph 80 for predicting the end of next week's value of the one-year
treasury bill rate. Portion 82 of graph 80 illustrates historical values of the treasury
bill rate over a time interval of approximately 5 weeks. On the right side of graph 80
are eleven buttons, such as buttons 84 to 86, that range from up 75 basis points to
down 75 basis points. With this arrangement, participants can graphically predict
what the value will be, in 15 basis point increments. Thus, for example, if one
believes that the rate will be roughly the same as the most recent historical value,
she would click button 84. Similarly, to indicate a prediction of "up 30 basis points"
from the most recent historical value she would click button 85, and to indicate a
prediction of "down 45 basis points" she would click button 86. Preferably, when a
prediction is entered in this manner, the corresponding value (or change in value) is
indicated in a text box, such as text box 88. The participant can then edit this value,
such as for fine tuning, prior to submission. Alternatively, the participant might
completely bypass the graphical input and instead directly input her prediction into
text box 88.
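The button arrangement of Figure 6 can be illustrated with a short sketch that
generates the eleven offsets (up 75 to down 75 basis points in 15 basis point steps)
and converts a selected offset into the value that might be shown in a text box. The
function names and the 5.20 percent starting rate are hypothetical and serve only as
an example.

```python
def button_offsets(max_move_bp=75, step_bp=15):
    """Return button offsets in basis points, ordered from highest to lowest."""
    return list(range(max_move_bp, -max_move_bp - 1, -step_bp))


def predicted_rate(latest_rate_pct, offset_bp):
    """Convert a button offset (basis points) into a predicted rate in percent."""
    return round(latest_rate_pct + offset_bp / 100.0, 4)


offsets = button_offsets()
print(len(offsets))              # number of buttons on the graph
print(offsets[0], offsets[-1])   # top and bottom buttons
print(predicted_rate(5.20, 30))  # "up 30 basis points" from an assumed 5.20%
```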

The above graphs may be provided in a number of different ways and may
include a variety of different features designed to enhance their usefulness to the
participants. For example, the division between the historical data and the predicted
future data might be designated by a change in color or by using a broad line, unique
to the display. Similarly, the bands for prediction time frames may be designated by
a change in color, a column of symbols, or any other method. In addition, if there is
a large number of data points (whether historical or prediction bands) displayed, the
date corresponding to any given time point might appear as a pop-up as the cursor
is dragged across an imaginary vertical line through that point.

Also, additional data can be linked to the cursor position in the x coordinate
(e.g., a specific date) and/or the y coordinate. For example, historical news
headlines, date-specific commentary, date-specific prediction data, and other
information may be linked to the date corresponding to the cursor position. Thus, at
any given point within the historical data portion of the graph, or after blocking an
interval of the historical portion, the participant might right click her mouse and then
select "news headlines" from the menu, whereupon a list of news headlines for that
time point or time interval, as applicable, would be downloaded to the participant's
computer. Similarly, articles and date-specific prediction information may be linked
to the dollar value corresponding to the cursor position. Thus, right clicking and then
selecting "prediction statistics" from the menu might display various prediction
information relating to that dollar value of the subject stock, such as the percentage
of forecasters who have predicted that the stock price will reach at least that dollar
value within the subject time frame. Such linked information might be pre-designated
or generated on-the-fly. As examples of the latter case, a linked information request
might cause a search of the Archives or might initiate certain processing of data
within the prediction database.

Rather than displaying multiple prediction time frames on the initial graph, a
single prediction time frame (e.g., the end of next week) might be displayed on the
initial graph (e.g., with the default historical data for that prediction time frame).
Then, after the participant submits a prediction for that time frame, the graph is
automatically reconfigured to display the next prediction time frame (e.g., the end of
13 weeks, together with the default historical data for that prediction time frame).
This process would then continue until predictions had been submitted for all
prediction time frames. When determining how many different prediction time frames
to indicate on a single graph, there generally will be a tradeoff between the amount
of historical information that can then be provided and the convenience of being able
to enter predictions for multiple time frames on a single graph.

When predicting values for multiple related variables, the graphical user input
can be provided in several different ways. For example, the Treasury Yield
Challenge involves forecasting the yields on 5 bonds of differing maturity at 5 future
points in time. The participant could accomplish this task by repeating any of the
exercises described above for each of the five different variables (i.e., for 3 month
and 1 year bills, 5 and 10 year notes, and 30 year bonds). If a different graph is
displayed for each different time frame, this may require the display of 25 different
graphs. Moreover, when using such a process it might be difficult to visualize how
the different variables interrelate.

One solution to this problem might be to permit the participant to display
graphs for multiple variable/time-frame combinations in a stacked manner, and then
enter predictions on each graph as described above. This embodiment is illustrated
in Figure 7, in which graphs 91 to 95 indicate prediction entry graphs for entering
predictions for the end of next week for the five respective variables included in the
Stock Market Challenge. Specifically, a participant simply clicks in the appropriate
prediction band 101 to 105 to enter a prediction for each variable in the Challenge.
Also provided are text boxes 111 to 115, respectively, for fine tuning predictions or
bypassing the graphical input altogether. Alternatively, a single text box might be
provided for all of the graphs displayed.

The foregoing embodiment can permit the participant to view data for a
number of different variables (or time-frame/variable combinations) at the same time.
However, this embodiment typically would require the participant to have a fairly
large display screen, and therefore such a technique might be impractical for most
participants. In addition, it may be desirable to provide the participant with the
means to evaluate her predictions from different points of view prior to submitting
them.

Specifically, it may be desirable to permit various display manipulations
between when the predictions are "entered" by the participant and when they are
"submitted" to the contest. For example, with respect to the Treasury Yield
Challenge, the participant might individually estimate the time series of the yield on
each instrument, and then obtain a display (a "time series comparison view") that
includes superimposed curves corresponding to multiple variable/time-frame
combinations (e.g., each in a different color) on a single graph, enabling the
participant to view historical and forecast values for multiple variables (e.g., the yields
for all five instruments). This is illustrated in Figure 8, which shows historical data
121 to 125 for the five variables, as well as the current predictions 131 to 135,
respectively, for the time frame of interest. Further corrections could be made at this
point if the forecast co-movements did not appear correct, such as by returning to the
time series view for a single variable and then changing the prediction value(s).

In addition to time series views, the participant preferably also has the option
to request the cross-section (rotation) of the time series comparison view. With
respect to the bond example given above, this view is referred to as the
"cross-maturity comparison view", and shows 5 different curves (for the five different
prediction time frames) of yield rate plotted against maturity date. Accordingly, this
view provides another check point for making corrections to the participant's
predictions.

It is also noted that, rather than using the time series comparison view and the
cross-section (rotation) of the time series comparison view solely for verification
purposes, a participant might also be permitted to enter predictions within those
views. Because multiple variables are displayed in the time series comparison view,
some means for designating the variable for which a prediction is being entered
generally must be provided, such as clicking a radio button corresponding to the
variable on the display. One advantage of this technique is that the participant is
permitted to display data and enter predictions for different variables on the same
graph, thus providing a constant view of data for interrelated variables.

As a further alternative to the above technique, the participant might initially
forecast values within the cross-section (rotation) of the time series comparison view
(e.g., in the same manner described above for entering predictions in the time series
comparison view) and then request that the data be re-formatted into the time series
comparison view for validation and/or corrections. Upon receipt of such a request,
the Workbench automatically would generate the time series comparison view.

In a still further embodiment, the participant has the option of entering and/or
modifying predictions in either the time series comparison view or the cross-section
(rotation) of the time series comparison view and then switching back and forth
between the different views. By iteratively fine tuning in each view, and then having
the Workbench transform the data into the other view, the participant often will be
better able to produce and submit forecasts that are more consistent with her actual
expectations. In general terms, each of the different views can be provided either
for reference purposes only or for both reference and prediction input, depending
upon the specific embodiment of the invention.

Challenges that flow from the yield curve can be handled in a similar manner.
In terms of the risk spread, prediction using the time series view can be repeated
with an Aaa series imposed or, at the user's option, the difference may be graphed
(e.g., 1 year Aaa yield - 1 year treasury yield). Beyond that point, it may be more
useful to graph the spreads (e.g., to avoid ten lines on a graph). The time series of
the spreads at different maturities would be presented in a style similar to the "time
series comparison view", and the future term structure of spreads in a style similar
to the "cross-maturity comparison view". The same input modes would apply, and
the participant would again have the ability to examine her predictions from different
perspectives prior to submitting them.
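The difference mentioned above (e.g., 1 year Aaa yield - 1 year treasury yield) is a
simple per-maturity subtraction. The following sketch computes such spreads in basis
points; the function name `risk_spreads`, the maturity labels and all yield figures are
hypothetical and chosen only for illustration.

```python
def risk_spreads(aaa_yields, treasury_yields):
    """Return Aaa-minus-treasury spreads in basis points for matching maturities.

    Both arguments map a maturity label to a yield in percent. Maturities that
    are missing from either series are skipped.
    """
    return {
        maturity: round((aaa_yields[maturity] - treasury_yields[maturity]) * 100)
        for maturity in treasury_yields
        if maturity in aaa_yields
    }


# Hypothetical yields, in percent.
aaa = {"1Y": 5.75, "5Y": 6.5}
treasury = {"1Y": 5.25, "5Y": 5.75, "30Y": 6.25}
print(risk_spreads(aaa, treasury))
```

Plotting one spread curve per maturity, rather than both underlying yield curves,
halves the number of lines on the graph.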

In short, the Workbench preferably can: (1) allow the participant to submit 
individual time series estimates, aggregate them, and then take the cross section; 
or (2) allow the participant to submit cross-section estimates, and convert those 
estimates into aggregated and disaggregated time series. 
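The conversion between the two views amounts to re-indexing the same set of
forecasts. A minimal sketch, assuming the forecasts are held as a nested dictionary
(a data layout that is not prescribed by the specification), is given below; the
instrument names, horizon labels and yields are made up for the example.

```python
def to_cross_section(time_series):
    """Rotate {instrument: {horizon: yield}} into {horizon: {instrument: yield}}."""
    cross = {}
    for instrument, by_horizon in time_series.items():
        for horizon, value in by_horizon.items():
            cross.setdefault(horizon, {})[instrument] = value
    return cross


# Hypothetical forecasts (percent) for two instruments and two horizons.
forecasts = {
    "3M bill": {"ENW": 4.9, "ENW+13w": 5.0},
    "30Y bond": {"ENW": 6.1, "ENW+13w": 6.3},
}
by_horizon = to_cross_section(forecasts)
print(by_horizon["ENW"])  # one cross-maturity curve, for the end of next week
```

Because the rotation is symmetric, applying the same function to the cross-section
data returns the original time series layout, which is what allows switching back and
forth between views without loss of information.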

To aid in forecasting, other data curves for other variables preferably can be
presented as overlays to the data curves for the prediction variables. These data
curves preferably can either be displayed contemporaneously with those of the
prediction variables, or can be offset with time leads or lags, as specified by the
participant. In addition, arbitrarily selected values preferably can be graphically
added to, or multiplied by, the various data curves, as desired by the participant so
as to provide the participant with the maximum flexibility in manipulating various
historical and prediction data to further aid in the participant's individual forecasting.
The result can be a "visual" regression analysis that may be highly useful in
performing the various forecasts.
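The overlay manipulations described above (time leads or lags, and adding or
multiplying by a selected value) might be sketched as follows. The function names
and the use of `None` for positions without data are assumptions made for this
example only.

```python
def shift_series(values, lag):
    """Shift a series by `lag` periods; positive lag delays it, negative leads it.

    Positions with no source data are filled with None so the length is unchanged.
    """
    n = len(values)
    if lag >= 0:
        return [None] * min(lag, n) + values[: max(n - lag, 0)]
    lead = -lag
    return values[lead:] + [None] * min(lead, n)


def transform(values, add=0.0, multiply=1.0):
    """Apply value * multiply + add to every available point."""
    return [None if v is None else v * multiply + add for v in values]


overlay = [1.0, 2.0, 3.0, 4.0]
print(shift_series(overlay, 1))    # lagged by one period
print(shift_series(overlay, -1))   # led by one period
print(transform(overlay, add=0.5, multiply=2.0))
```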

Thus, the graphical display for entering predictions can be configured in a
variety of ways to achieve maximum flexibility. In particular, the display interface
according to the invention can provide graphs showing any combination of different
variables and different time frames for entering predictions. Moreover, the present
invention can permit each individual participant to customize her display in this
regard so as to accommodate her own preferences.

In addition to displaying historical data for one or more variables, participants
preferably also have the option of displaying their own previous predictions and/or
the previous predictions of other participants. With regard to the latter, other
participants' predictions may be displayed, for example, as a time series of the
central tendencies of those predictions, together with an indication of the dispersion
measure for those predictions at each point in time.

An example is illustrated in Figure 9, in which a measure of central tendency
150 for the other participants' predictions over time is plotted, together with an
indication 152 of the dispersion around that central tendency. Preferably, the
dispersion band 152 is symmetrical around the central tendency curve, with the
upper limit of the dispersion band 152 being equal to the central tendency value plus
the dispersion measure and the lower limit being equal to the central tendency value
minus the dispersion measure. It is noted that any measure of central tendency
(e.g., mean, median, trimmed mean or median) and any measure of dispersion (e.g.,
variance or the EUM measure described below) may be used, and the individual
participant may even be given the option of which such measures to plot. In any
event, the ability to display such information can provide a useful tool when a
participant is attempting to formulate her own predictions. The foregoing information
preferably may be plotted for all participants or any subset thereof (e.g., only
participants in the requesting participant's Universe), preferably at the discretion of
the requesting participant.
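A symmetrical band of the kind shown in Figure 9 can be computed as the central
tendency plus and minus the dispersion measure at each point in time. The sketch
below uses the median and the population standard deviation purely as stand-ins;
the text permits any such measures, and the names and sample predictions here are
hypothetical.

```python
import statistics


def dispersion_band(predictions_by_date):
    """For each date, return (lower, center, upper) around the median prediction."""
    band = {}
    for date, predictions in predictions_by_date.items():
        center = statistics.median(predictions)
        spread = statistics.pstdev(predictions)  # assumed dispersion measure
        band[date] = (center - spread, center, center + spread)
    return band


# Hypothetical predictions submitted by other participants.
others = {
    "week 1": [100.0, 110.0, 120.0],
    "week 2": [90.0, 110.0, 130.0],
}
for date, (lower, center, upper) in dispersion_band(others).items():
    print(date, round(lower, 2), center, round(upper, 2))
```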

An additional statistical tool that may be provided is a regression package
using preselected data and data transformations which will allow users to create their
own statistical forecast models. Specifically, users may select dependent and
independent variables from menus and then will choose which transformations (e.g.,
leads, lags, logs) to apply to the series prior to statistical estimation.

The Workbench preferably also provides statistical analysis on the
participants' past forecasts versus realizations (i.e., errors). More preferably, the
Workbench not only provides measures of error and bias, but also compares the
forecasts to a number of implied models and identifies the closest model (e.g., "the
subscriber forecasts as if she were using the following equation . . ."). The identified
implied model preferably is then compared to optimal models to suggest what the
participant may be under or over weighting. Both of these features preferably are
included in the diagnostic and tutorial sections of the Workbench.
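As an illustration of the measures of error and bias mentioned above, the following
sketch compares past forecasts with realizations. The particular statistics (mean
error as bias, mean absolute error, root mean squared error) are common choices
assumed here, not ones named in the specification, and the sample figures are
invented.

```python
def forecast_diagnostics(forecasts, realizations):
    """Summarize forecast errors, where error = forecast - realization."""
    if len(forecasts) != len(realizations) or not forecasts:
        raise ValueError("need two non-empty series of equal length")
    errors = [f - r for f, r in zip(forecasts, realizations)]
    n = len(errors)
    return {
        "bias": sum(errors) / n,                      # signed mean error
        "mae": sum(abs(e) for e in errors) / n,       # mean absolute error
        "rmse": (sum(e * e for e in errors) / n) ** 0.5,
    }


# Hypothetical past forecasts and the values actually realized.
stats = forecast_diagnostics([102.0, 98.0, 105.0], [100.0, 100.0, 100.0])
print(stats)
```

A positive bias in this convention indicates that the participant has tended to
forecast too high.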

The following describes a representative example of graphical input according
to the preferred embodiment of the invention. First, the participant selects the
Interest Rate challenge as the challenge in which she wishes to participate. Next,
the participant selects a view. Seven possible views exist: two summary views and
five different forecast entry tool views. The summary views include the "time series
comparison view" and the "cross-maturity comparison view". The five forecast tool
views are for forecasting 3 month and 1 year treasury bill yields, 5 and 10 year notes,
and 30 year bond yields and are similar to Figure 5B. By selecting the 1 year t-bill
forecast, a graph will be displayed with that variable's realized (historical) values
displayed on the left and five bands displayed on the right corresponding to each of
the forecasting horizons (e.g., end of next week (ENW), 4 weeks from ENW, 13
weeks from ENW, 52 weeks from ENW, and end of year (EOY)).

Before entering her forecasts, the participant may want to see old
non-realized forecasts or other historical series. To select non-realized forecasts, two
checkboxes are provided to allow the participant to display: (1) her most recent
forecast (either for the current round if already entered, or from the previous week's
game); and/or (2) last week's median forecast for the variable selected. As to other
historical series, the participant may select, for example, her own forecasts or the
overall median forecasts for the period. These are overlaid on the realized values
to facilitate analysis. As each additional series is selected, a labeled data display
field appears. When the user selects a specific historical time (represented by
dragging a vertical indicator to the desired position), values for each variable appear
in the display fields. Other tools may also be provided which allow the participant to
transpose or forecast values.

Next, the forecasts are entered by selecting the time horizon (the forecast for next
Friday is the default) and entering the value either numerically in a text box below the
band, or by clicking on the appropriate spot within the band to enter the value and
then fine tuning, if desired. The foregoing is then repeated for each band for the
current variable and then all five time horizons are forecast for the other four
variables. Finally, the two summary views are reviewed, the forecasts adjusted as
desired, and then the forecasts are submitted upon completion.

The user interface according to the invention may also be configured in any
of a number of different ways so as to permit a participant to submit an estimate of
her own uncertainty regarding her forecast. For example, upon entering each
forecast, such as in any of the manners described above, the participant may have
the option of clicking one of several radio buttons, each indicating a different level
of confidence (e.g., "very high", "high", "medium", "low", "very low"). Alternatively,
the participant may be provided with the option of dragging a slide bar in order to
indicate her level of confidence (on an approximately continuous scale), for example,
from "very high" to "very low" confidence.

As noted above, in the preferred embodiment of the invention, the above
graphs are provided over an electronic network, such as the Internet, by means of
a Java applet. The following describes one embodiment for implementing the above
functionality.

When a participant initially selects the "Tournament" page link from one of the
other web pages of the contest website, the participant's browser sends an IP packet
addressed to the contest website server requesting that page. In response, the
contest website server downloads a Java applet to the participant's computer. In the
preferred embodiment of the invention, the Java applet includes instructions to
execute the process steps illustrated in Figure 10.

Referring to Figure 10, in step 162 configuration information is obtained.
Based on the identity of the participant (e.g., provided at login or stored as a cookie
from a prior login) the applet will obtain configuration information from the server.
Such information preferably includes (but is not limited to) the "default" variable
(generally the variable most often forecast, or last forecast), specifications of all
variables that previously have been forecast by this participant, plus any other
variables to which the participant may have access, given her service level. Each
variable preferably has associated with it certain additional configuration information,
such as earliest date (DTe), earliest displayed date (DTd), and granularity (G).

In step 163, the applet queries the participant regarding how she would like
the data displayed. For instance, the participant might be provided with the option
to have the historical and prediction data displayed (1) one variable with one
prediction time frame at a time; (2) multiple variables in stacked graphs; (3) multiple
variables superimposed on the same graph; or (4) any other combination of the
various display options discussed herein. When the participant provides her option
selection, such as by clicking on a radio button, or a combination of radio buttons
with each set directed to a different feature, the applet stores this information for later
use.

In step 164, historical data are retrieved from the server for the interval from
DTd to present, at granularity G, for the "default" variable. Then, data are retrieved
from the server for the most recent forecasts of the "default" variable.

In step 166, the applet either graphs or merely stores the historical and
prediction data for the current variable, depending upon the particular variable and
the current display instruction. For example, if the current variable is the "default"
variable, the applet preferably will display a graph with the "default" variable
(historical and most recent forecasts) according to the display options selected by
the participant. On the other hand, if the applet has just completed downloading
information for a different variable, whether that information is displayed or merely
stored preferably will depend on the display option information provided by the
participant. For example, if the participant elected to have the variable
superimposed on the same graph or displayed on a stacked graph, the information
for the variable will be immediately displayed in the appropriate manner. However,
if the participant elected to have only one variable displayed at a time, the
information for the current variable will be merely stored until the participant is ready
to have it displayed. In order to graph particular values, each data point is mapped
onto a location on the display as a function of its value, with the scale of the graph
being determined by DTd, G and the maximum and minimum data values over the
displayed interval.
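The mapping of data points onto display locations can be sketched as follows. The
example assumes the data arrive as a list of values sampled at a fixed granularity G
starting at DTd, so that the list index stands in for the date offset; the function
name and the plot dimensions are hypothetical.

```python
def to_screen(points, width, height):
    """Map a series of values to (x, y) pixel positions on a width x height plot.

    The horizontal scale follows the point index (one step per granularity
    unit) and the vertical scale follows the minimum and maximum values.
    Screen y grows downward, so the maximum value is drawn at y = 0.
    """
    lo, hi = min(points), max(points)
    span = (hi - lo) or 1.0                   # avoid dividing by zero on a flat series
    x_step = width / max(len(points) - 1, 1)
    return [
        (round(i * x_step), round(height - (v - lo) / span * height))
        for i, v in enumerate(points)
    ]


print(to_screen([10.0, 20.0, 15.0], width=200, height=100))
```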

In step 168, a determination is made whether the current variable is the last
variable. If so, then processing proceeds to step 170 to await additional commands
from the participant. If not, then processing returns to step 164 to retrieve data for
the next variable.

In step 170, the applet waits for additional participant instructions. Such
instructions might include, for example: (1) request a graph of a variable that has not
yet begun loading; (2) request a graph of a variable that has not previously been
forecast, and so has not been queued for loading; (3) request an earlier time interval
for a variable (prior to that variable's DTd but not earlier than DTe); (4) request a
smaller time interval for a variable (indicating that data at finer granularity than the
current value of G is needed); or (5) request that data for a variable that has already
been loaded be superimposed as a new curve on an existing graph. It should be
understood that the foregoing are merely exemplary; the participant may be
permitted to request any display of data, as described in more detail above.

In step 172, it is determined whether new data are required. For example,
with regard to the examples given in connection with the discussion of step 170,
requests (1) to (4) would require additional data from the server, while request (5)
would not. If more data are required, steps 164, 166 and 168 are repeated for each
required variable in order to obtain and either store or graph such additional data.
Otherwise, processing proceeds to step 174.

In step 174, the participant's instruction is processed using stored data. For
example, with respect to request (5) described above in connection with the
discussion of step 170, the data for the additional variable are retrieved from memory
(e.g., RAM) or from mass storage (e.g., hard drive), as appropriate, and then are
converted into graphical display data and added to the existing graph. Upon
completion of step 174, processing returns to step 170 to await the next instruction.
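The decision made in steps 172 and 174 can be summarized in a few lines. The sketch
below is written in Python rather than as applet code, and every name in it (the
`Request` enumeration, `handle`, the `fetch` callback and the sample data) is
hypothetical; it shows only the branching between fetching new data and reusing
stored data.

```python
from enum import Enum


class Request(Enum):
    GRAPH_UNLOADED_VARIABLE = 1      # request (1)
    GRAPH_NEW_VARIABLE = 2           # request (2)
    EARLIER_INTERVAL = 3             # request (3)
    FINER_GRANULARITY = 4            # request (4)
    SUPERIMPOSE_LOADED_VARIABLE = 5  # request (5)


NEEDS_SERVER_DATA = {
    Request.GRAPH_UNLOADED_VARIABLE,
    Request.GRAPH_NEW_VARIABLE,
    Request.EARLIER_INTERVAL,
    Request.FINER_GRANULARITY,
}


def handle(request, variable, stored, fetch):
    """Step 172: fetch if new data are required. Step 174: use stored data."""
    if request in NEEDS_SERVER_DATA:
        stored[variable] = fetch(variable)   # stands in for steps 164 to 168
    return stored[variable]                  # data to be converted for display


# Stand-in for the server: records which variables were requested.
server_calls = []

def fake_fetch(variable):
    server_calls.append(variable)
    return [9.0]

stored = {"SP500": [1.0, 2.0]}
print(handle(Request.SUPERIMPOSE_LOADED_VARIABLE, "SP500", stored, fake_fetch))
print(handle(Request.GRAPH_NEW_VARIABLE, "DJIA", stored, fake_fetch))
print(server_calls)
```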

In the preferred embodiment of the invention, the data are stored at the server
in a database (preferably relational), arranged as a set of named tables. Each table
consists of a number of rows representing the sets of data to be stored. Each table
also consists of named columns representing the components of each row. The
applet's access to the database is assumed to use a standard data access protocol
such as JDBC, with a driver (if necessary) to provide connectivity to the remote
database.

Each of the above data definitions can be interpreted as a query referring to
one or more tables and requesting sets of data that satisfy the specification. Thus
(for example), "Retrieve historical data from the server for the interval from DTd to
present, at granularity G for the 'default' variable" could be represented as a pair of
queries similar to:

Select * from SP500RealizedHistory where (StartDate = 'DTd') and (EndDate =
CURRENT DATE) and (Granularity = 'G')

And

Select * from SP500ForecastHistory where (StartDate = 'DTd') and (EndDate =
CURRENT DATE) and (CustomerID = '123456')

In this example, the table SP500RealizedHistory might contain the following
columns:

StartDate      A date representing the start of the time interval
EndDate        A date representing the end of the time interval
Granularity    An integer representing the distance between data points
Count          An integer representing the number of data points in the interval
Data           A BLOB (Binary Large Object) consisting of the array of data points
               as floats

And the table SP500ForecastHistory might contain the following columns:

CustomerID     An integer representing the identity of the customer
StartDate      A date representing the start of the time interval
EndDate        A date representing the end of the time interval
Count          An integer representing the number of data points in the interval
Data           A BLOB (Binary Large Object) consisting of the array of data points
               as floats
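To make the table layout concrete, the following sketch stores and retrieves one row
of realized history. It uses Python's built-in sqlite3 module with an in-memory
database instead of JDBC against a remote server, and it assumes the BLOB holds
little-endian 32-bit floats; the dates and data values are invented, and none of these
choices comes from the specification.

```python
import sqlite3
import struct


def pack_floats(values):
    """Encode a list of numbers as a BLOB of little-endian 32-bit floats."""
    return struct.pack(f"<{len(values)}f", *values)


def unpack_floats(blob, count):
    """Decode `count` little-endian 32-bit floats from a BLOB."""
    return list(struct.unpack(f"<{count}f", blob))


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE SP500RealizedHistory ("
    "StartDate TEXT, EndDate TEXT, Granularity INTEGER, Count INTEGER, Data BLOB)"
)

points = [1400.5, 1410.25, 1425.0]  # hypothetical index values
conn.execute(
    "INSERT INTO SP500RealizedHistory VALUES (?, ?, ?, ?, ?)",
    ("2000-01-03", "2000-01-05", 1, len(points), pack_floats(points)),
)

# Parameterized form of the first query: DTd, the end date and G are bound values.
row = conn.execute(
    "SELECT * FROM SP500RealizedHistory "
    "WHERE StartDate = ? AND EndDate = ? AND Granularity = ?",
    ("2000-01-03", "2000-01-05", 1),
).fetchone()

history = unpack_floats(row[4], row[3])  # columns: Data, Count
print(history)
```

Storing the whole interval as one row means a single query returns every point
needed for the graph, at the cost of decoding the array on the client.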

Note that the CustomerID represents the identity of the participant, as
determined above. By preformatting rows into a relatively small number of
collections, the load on the database server is significantly reduced. Alternatively,
it is feasible to cache all data in a "middleware" application and then communicate
between the client and server via a proprietary protocol. This has the advantage that
it does not require any database activity unless some of the data requested are not
already present in the cache. Multiple variables may also be combined into one
more elaborate table to simplify adding new variables.
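The caching alternative can be sketched as a small read-through cache that touches
the database only when the requested data are not already held. The class name, the
key layout and the loader callback are all hypothetical, and the proprietary
client/server protocol mentioned above is not modeled.

```python
class HistoryCache:
    """Serve history from memory and query the database only on a miss."""

    def __init__(self, load_from_database):
        self._load = load_from_database  # hypothetical callable: key -> data
        self._store = {}
        self.database_queries = 0

    def get(self, variable, start_date, granularity):
        key = (variable, start_date, granularity)
        if key not in self._store:
            self.database_queries += 1
            self._store[key] = self._load(key)
        return self._store[key]


# Stand-in loader that fabricates a short series for any key.
cache = HistoryCache(lambda key: [1.0, 2.0, 3.0])
first = cache.get("SP500", "2000-01-03", 1)
second = cache.get("SP500", "2000-01-03", 1)
print(cache.database_queries)  # the second request is served from the cache
```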

If dispersion information is also available to this participant, then equivalent 
queries and table structures would be used, but the specific tables would have larger 
data arrays, as each "element" of the array would itself be an array of percentile and 
median values. 

In a similar fashion, and using the known identity of the participant, the
database server or middleware application is queried as to the most recent values
forecast for a given variable.

When a new forecast value is entered and confirmed, the data are transmitted
back to the database server using an update statement such as:

Update SP500Forecasts set EndOfYear = '1510', CEndOfYear = '0.85' where
CustomerID = '123456'



In this example, the table SP500Forecasts might contain the following columns: 

CustomerID An integer representing the identity of the customer
EndNxtWeek The participant's current forecast for the end of next week
EndNxtWeek4 The participant's current forecast for 4 weeks from the end of
next week
EndNxtWeek13 The participant's current forecast for 13 weeks from the end of
next week
EndNxtWeek52 The participant's current forecast for 52 weeks from the end of
next week
EndOfYear The participant's current forecast for the end of the year
CEndNxtWeek The participant's prediction certainty for the forecast for the end
of next week
CEndNxtWeek4 The participant's prediction certainty for the forecast for 4 weeks
from the end of next week
CEndNxtWeek13 The participant's prediction certainty for the forecast for 13
weeks from the end of next week
CEndNxtWeek52 The participant's prediction certainty for the forecast for 52
weeks from the end of next week
CEndOfYear The participant's prediction certainty for the forecast for the end
of the year

Generally, the forecasts made will also be accumulated in another table for tracking 
and data analysis purposes. 
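
The update and the accompanying accumulation step might be sketched as
follows. This example uses an in-memory SQLite database purely for
illustration; the tracking table name SP500ForecastLog, its LoggedAt column,
and the function submit_forecast are hypothetical and do not appear in the
text. Parameter binding is used in place of quoted literals.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE SP500Forecasts ("
    "CustomerID INTEGER PRIMARY KEY, EndOfYear REAL, CEndOfYear REAL)"
)
# Hypothetical tracking table that accumulates every submitted forecast.
conn.execute(
    "CREATE TABLE SP500ForecastLog ("
    "CustomerID INTEGER, EndOfYear REAL, CEndOfYear REAL, "
    "LoggedAt TEXT DEFAULT CURRENT_TIMESTAMP)"
)
conn.execute("INSERT INTO SP500Forecasts VALUES (?, ?, ?)", (123456, 1475.0, 0.6))


def submit_forecast(conn, customer_id, end_of_year, certainty):
    """Overwrite the current forecast and append a copy to the tracking table."""
    with conn:  # both statements commit together or not at all
        conn.execute(
            "UPDATE SP500Forecasts SET EndOfYear = ?, CEndOfYear = ? "
            "WHERE CustomerID = ?",
            (end_of_year, certainty, customer_id),
        )
        conn.execute(
            "INSERT INTO SP500ForecastLog (CustomerID, EndOfYear, CEndOfYear) "
            "VALUES (?, ?, ?)",
            (customer_id, end_of_year, certainty),
        )


submit_forecast(conn, 123456, 1510.0, 0.85)
current = conn.execute(
    "SELECT EndOfYear, CEndOfYear FROM SP500Forecasts WHERE CustomerID = ?",
    (123456,),
).fetchone()
log_rows = conn.execute("SELECT COUNT(*) FROM SP500ForecastLog").fetchone()[0]
```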

Although the above-described embodiment utilizes a Java applet, it is noted
that the same process can be executed by a software application which is
permanently installed on the participant's computer. Also, as noted above, rather
than continuously having to download data from the server as needed, the software
could store some portion of such data (either permanently or temporarily, e.g., in the
latter case managing such storage and deleting the stored data after some period
of time) in order to reduce the required download times.

Community-Selected Content

In addition to providing participants the opportunity to submit predictions and
become ranked, as described above, the website according to the preferred
embodiment of the present invention also includes certain resources that are
available to the participants (or users), although the amount of resources provided
to any single participant may depend upon the subscription level of the participant.

Among these resources, the contest website according to the preferred
embodiment of the invention includes a number of distinct content areas (such as
100 different areas) on various topics of interest. These content areas are referred
to herein as "Soapboxes". Moreover, although preferably implemented as content
areas within the contest website, it should be understood that the Soapboxes may
instead be implemented as separate websites, with the contest website including a
link to each such Soapbox website. When included in a financial/economic
forecasting contest website, the Soapboxes preferably are initially allocated
according to the approximate representation of similar topics in the financial press
and, to a lesser extent, the content of existing Internet sites.

Each Soapbox preferably has a title, an author, a "current headline" and a
"feature article". These elements can be used for personalized home page
construction. In the preferred embodiment of the invention, Soapboxes are designed
to allow individuals or entities (the Soapbox Proprietors) to structure community
interaction around a topic, philosophy, or point of view. Thus, in addition to simply
including information, the Soapbox sites might include chat rooms, live broadcasts
(either interactive or non-interactive) and other mechanisms designed to elicit user
feedback. In order to provide access to the Soapboxes, one page of the contest
website might include an overview for, and hyperlink to, each Soapbox, with each
overview including the Soapbox title, headline, author, and an initial part of the
"feature article".

It is also preferable that a search mechanism allows users to find relevant
Soapboxes based on keywords. For example, a neural net (or similar mechanism)
might weight search terms and matching documents to enhance precision and recall.
Additionally, users can be provided with the ability to ask to see Soapboxes "similar"
to a particular Soapbox.

In the preferred embodiment, the Soapbox Proprietors sponsor the content
of their Soapboxes and receive a stipend based upon popularity. It is also
preferable that, periodically, the least popular Soapboxes are turned over to new
Proprietors. It is further preferred that all Soapbox Proprietors must be subscribers
and must submit a prescribed minimum number of forecasts.

The following are the preferred rules for the Soapboxes: (1) candidates
wishing to sponsor a Soapbox must submit the proposed Soapbox title, a 100-word
description of the Soapbox, the Soapbox type (e.g., one of commentary, moderated
discussion, or narrated resource collection), three writing samples (each of 500
words or more), and three personal references; (2) each Soapbox item
accessed by a unique individual receives a point bump; (3) accessed Soapbox items
can also be rated, with a neutral rating equivalent to no rating (the item receives only
the default point bump), positive ratings worth positive (or more) points, and negative
ratings worth negative (or fewer) points; (4) points that accrue to Soapbox items also
accrue to the Soapbox owner; (5) access to archived Soapbox items also accrues
(preferably lesser) points to the Soapbox owner; (6) periodically, such as every
month, the lowest ranked (such as lowest 3%) of Soapboxes are "canceled" and
Soapbox slots thus opened are filled from waiting candidates; (7) stipends are paid
(based on the prior rating period) to Soapbox owners based on their ratings; (8)
ratings are delivered weekly to Soapbox owners; (9) the highest rated (such as the
"Top 10" and "Top 40") Soapboxes are highlighted, such as by including an
appropriate logo indicating that status, and the highest rated Soapboxes (such as the
"Top 10") are announced via press release every rating period; (10) Soapbox
candidates must have contributed forecasts for at least three months prior to
submitting their "application" and must continue to submit forecasts on a prescribed
basis as a condition of maintaining their Soapboxes; (11) there exists an Acceptable
Use Policy; (12) there exists an Oversight Board (preferably composed of contest
staff members, Soapbox Proprietors, representatives from the user community, and
outside representatives) charged with enforcing the Acceptable Use Policy - the
Oversight Board can discipline and/or remove Soapbox owners, but such actions
must be published within the Soapbox area; and (13) the foregoing rules are posted
in the Soapbox area.

The website according to the preferred embodiment of the invention also
includes a Digital Text Library (DTL) which is configured as an extensive, diverse
collection of text materials for reference and research. The DTL preferably includes
the Dumpster, the Archives, the Academy, the Research Room, the Reading Room,
and the Journal Room.

The Dumpster and the Archives contain community generated content, 
maintained primarily by the Soapbox Proprietors. 
The Dumpster is the repository for unreviewed and unedited text-based
material, uploaded by virtually anybody. Using a community scoring system (such
as described below), Dumpster items may be elevated into one of the other
collections. Dumpster contributions may also be identified by Soapbox Proprietors
as items to be sponsored into Archive status; in such cases, the sponsoring Soapbox
Proprietor's name preferably will be included as part of the descriptive information
when the Dumpster item is promoted to Archive status. To the extent possible,
Dumpster contributions are full-text searchable. The Dumpster content is not
included in other site searches but is separately indexed with a significant disclaimer
being displayed prior to searching or accessing these files.

The Archives is the primary full-text searchable database of materials
provided by and through Soapbox Proprietors as well as materials elevated from the
Dumpster. Soapbox Proprietors preferably can submit materials directly into the
Archives. As part of Soapbox construction, Proprietors can choose to incorporate
Archive Submission tools, in which community members submit materials to a
Soapbox Proprietor for review prior to uploading into the Archives. When a Soapbox
Proprietor approves a submission, the Soapbox Proprietor uses a Community
Upload Tool to enter the contribution into her Soapbox. After a minimum amount of
time as part of published Soapbox content, the submission is automatically uploaded
into the Archives. This is the same process the Proprietor uses for uploading her
own materials into the Archives. As discussed below, Archive materials preferably
generate cBucks for the content provider as well as for the sponsoring Soapbox
Proprietor when the materials are viewed by others.

The following are the preferred rules in connection with the Archives: (1)
Soapbox contents are automatically archived; (2) feature stories and other material
generated by the editorial staff of the contest are automatically archived; (3) Soapbox
owners can sponsor items to be added to the Archives; (4) there is a special area of
the Archives called the Dumpster - anyone can add material to the Dumpster; (5) all
items in the Archives have a rating (point value) derived from cumulative accesses;
(6) each item accessed by a unique individual receives a point bump; (7) accessed
items can also be rated, with a neutral rating equivalent to no rating (the item
receives only the default point bump), positive ratings worth more points, and
negative ratings worth negative points; (8) standard searches exclude the Dumpster
and returned items are sorted first by keyword match, then by rating and/or access
points; (9) Dumpster searches search only the Dumpster but return items sorted in
the same way as standard searches; (10) highly rated Dumpster items (e.g., those
exceeding a specified threshold score - see the discussion below) are "promoted"
out of the Dumpster to the Archives proper; (11) there is a "top 40" area of the
Archives, consisting of the forty highest rated items and the forty highest rated
authors within the last week, the last month, and cumulatively; (12) items not meeting
the Acceptable Use Policy are deleted; and (13) the Archive rules are posted in the
Archives.
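
Rules (8) and (9) can be illustrated with the following minimal sketch, which
searches either the Archives proper or only the Dumpster and orders results
first by keyword match and then by rating. The item fields (title, keywords,
rating, in_dumpster), the function name and the sample data are hypothetical;
counting matching keywords is only one possible measure of "keyword match".

```python
def search_archive(items, keywords, dumpster_only=False):
    """Return titles of matching items, best keyword match first, then by rating.

    A standard search skips Dumpster items; a Dumpster search sees only them.
    """
    wanted = {k.lower() for k in keywords}
    hits = []
    for item in items:
        if item["in_dumpster"] != dumpster_only:
            continue
        matches = len(wanted & {k.lower() for k in item["keywords"]})
        if matches:
            hits.append((matches, item["rating"], item["title"]))
    hits.sort(key=lambda h: (-h[0], -h[1]))
    return [title for _, _, title in hits]


# Hypothetical catalogue entries for illustration only.
items = [
    {"title": "Bank stocks after elections", "keywords": ["banks", "elections"],
     "rating": 12, "in_dumpster": False},
    {"title": "Airline strikes", "keywords": ["airlines"],
     "rating": 40, "in_dumpster": False},
    {"title": "Bank rumours", "keywords": ["banks"],
     "rating": 99, "in_dumpster": True},
    {"title": "Regional banks", "keywords": ["banks"],
     "rating": 30, "in_dumpster": False},
]
standard = search_archive(items, ["banks", "elections"])
dumpster = search_archive(items, ["banks"], dumpster_only=True)
```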

The Academy and the Research Room are a combination of contributed
materials, solicited materials, and freely available materials consolidated from
elsewhere on the web.

The Academy is a repository primarily for student papers, theses,
dissertations, and other academic writings primarily by undergraduate and graduate
students. These materials may be solicited through several "outstanding paper"
competitions. Papers will be submitted to the Academy Editor, a staff position, who
will catalog and then upload acceptable submissions into the Academy. In general,
each submitted paper must be sponsored by a college or university faculty member.
Each semester, there are hundreds of quality research papers on investment,
business, economics, and forecasting topics produced by students as part of their
training. Typically, the results of this research are completely lost following the
semester's end. While probably not publishable in academic journals, in part
because of the very specific scope of the research (e.g., "What Happened To Bank
Stock Prices After Clinton's Reelection?", "The Performance of United Airlines Stock
Following the Northwest Airlines Pilot Strike"), many of these papers would be of
interest to the broader financial and economic community either for direct review or
to provide assistance in other research. For example, investors could review
comparative industry research and prospective employers could identify students
with specific topical experience. The Academy entries preferably are full-text
searchable. As in other sections of the website, readers are able to rate papers and
search results can be ordered by rating score.

The Research Room is a repository for professionally written research papers.
The Research Room content preferably originates from three primary sources:
professionals may submit copies of working papers, research reports, and other text
to the Research Librarian; the contest website may sponsor research on specific
topics, including academic research performed using the contest's proprietary
databases; and the contest's Research Librarian can regularly add freely available
research papers to the permanent collection. Sources of such research papers
include numerous state and federal government agencies, members of the Federal
Reserve System, international not-for-profits, foundations, and numerous academic
departments which freely distribute working papers and faculty research summaries.
These documents may include PDF files in addition to fully searchable text. The
Research Librarian may do initial keyword labeling for contributions based on
abstracts or based on a physical review of the documents. In addition to providing
ratings, readers may have the ability to provide additional comments on Research
Room items, which preferably also are searchable and include a back-reference to
the reviewed document, allowing the community to dynamically enhance the
keyword and metalabels, particularly for lengthy documents which are not full-text
searchable.

The Reading Room preferably contains the full text of books and monographs
which are either in the public domain or for which the contest website has licensed
or purchased e-text rights. The Reading Room preferably provides these books in
an encrypted PDF format with full text search, and makes the encrypted texts
available for reading using the contest's online text reader. The Reading Room
preferably also has pointers to the contest Book Shop which sells custom printed
versions of these texts. While community members and Soapbox Proprietors are
able to suggest new acquisitions for the Reading Room, the Reading Room
preferably is controlled solely by the contest staff members (e.g., the Reference
Librarian).

The Journal Room preferably contains fully referenced academic journals
distributed electronically and sponsored by the contest staff members. The following
are examples of items which may be included in the Journal Room:

• a Journal that primarily discusses practitioner oriented investment strategies and
forecasting using consensus forecast data;
• Letters that include shorter practitioner oriented articles including methodology,
empirical results, and new models with application to practical forecasting and
investing;
• a Journal of Computation, Economics, and Statistics - an outlet for serious
methodological and empirical research utilizing consensus forecasting data; and
• Transactions - an outlet for serious academic research which has had difficulty
being published in other outlets primarily because of "taste trends" in academia.

The foregoing items may be published by the contest staff members and include
editorial boards whose members are Soapbox Proprietors and recognized scholars.
All accepted contributions preferably are fully indexed.

Each item in the Digital Text Library preferably is assigned a permanent file
name and unique URL, and has an associated catalogue entry which may be
updated. The basic catalogue entry preferably includes the URL of the originating
site, the document type, creation date, acquisition date, key words or abstract
(especially for documents which are not full text searchable), title, authors and
affiliations, the identity of the entry sponsor if any, and current rating information for
the document. Where appropriate, additional data may be included in the catalogue
entry. However, Dumpster entries preferably have a more limited catalogue entry.

Preferably, the Digital Text Library conforms to digital library best practices,
as the same change from time to time, in order to maximize the likelihood that the
DTL provides a useful resource database, rather than simply a mass of data.
To this end, it is currently preferred that the DTL implement Z39.50 WAIS standards
for accessing and retrieving free text data.

As indicated above, the Soapboxes, items in the Dumpster and items in the
Archives preferably are scored based on their value to the users. Each such
resource preferably is ranked each week based on user ratings. Although such
rankings can be performed in a number of different ways, the following describes a
ranking system in the preferred embodiment of the invention.

Each item may be assigned a fixed number of points, such as 1, either each
time it is accessed, each time it is accessed by a unique individual, each time it is
accessed by a unique individual over a given period of time (e.g., a maximum of 1
point per unique user per day), or using any other system that assigns a
predetermined number of points based on access alone.
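
One of the access-based options above (a maximum of 1 point per unique user
per day) might be sketched as follows. The log format, the function name and
the sample entries are assumptions made for this illustration.

```python
from collections import defaultdict


def access_points(access_log, points_per_access=1):
    """Total access points per item, counting each (item, user, day) once.

    access_log is an iterable of (item_id, user_id, day) tuples.
    """
    seen = set()
    totals = defaultdict(int)
    for item_id, user_id, day in access_log:
        key = (item_id, user_id, day)
        if key in seen:
            continue  # repeat visit by the same user on the same day
        seen.add(key)
        totals[item_id] += points_per_access
    return dict(totals)


# Hypothetical access log for illustration only.
log = [
    ("essay-1", "alice", "2000-03-01"),
    ("essay-1", "alice", "2000-03-01"),  # ignored: same user, same day
    ("essay-1", "bob", "2000-03-01"),
    ("essay-1", "alice", "2000-03-02"),
    ("essay-2", "bob", "2000-03-02"),
]
points = access_points(log)
```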

It is also preferred that users are allowed to rate the utility of the resources
that they access. For example, users may be given the following options for rating
resources, with the point values for each option indicated:

-2: Terrible
-1: Poor
0: Neutral
+1: Good
+2: Excellent

The point values may or may not be disclosed to the users. A failure to rate
preferably results in a point value of 0. Preferably, the point values from such ratings
are added to the point values from access alone, although it is also possible to
assign points for access only or for ratings only. Such point values might be used
directly to rank the various resources. However, in the preferred embodiment of the
invention, the point values originating from users who are deeply involved in the
website are given more weight than the point values originating from less involved
users. In the preferred embodiment of the invention, this is accomplished by
evaluating each user's activity over an Assessment Period (e.g., the previous 90
days) and assigning the user an "Intensity Budget" (IB) based on such activity, such
as the following (assuming a 90-day Assessment Period):

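\[
\begin{aligned}
IB ={} & (1 + a_0 \cdot \text{num\_forecasts})^{b_0} \cdot (1 + a_1 \cdot \text{soapbox\_activity})^{b_1} \\
& \cdot (1 + a_2 \cdot \text{resource\_activity})^{b_2} \cdot (1 + \text{forecast\_score})^{b_3} \\
& \cdot (1 + a_4 \cdot \text{annual\_fees\_paid})^{b_4} \cdot (1 + a_5 \cdot \text{num\_club\_forecasts})^{b_5} \\
& \cdot (1 + a_6 \cdot \text{ad\_banner\_clicks})^{b_6} \cdot (1 + a_7 \cdot \text{num\_referred\_customers})^{b_7} \\
& \cdot (1 + a_8 \cdot \text{cBucks\_earned})^{b_8} \cdot [\text{final factor in } \alpha \text{ and } \gamma \text{, illegible in the source}]
\end{aligned}
\]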



where:

num_forecasts = the number of forecasts made by the user during the
previous ninety days;

soapbox_activity = number of hits by the user (maximum of 1 per hour) during
the previous ninety days (i.e., ranges from 0 to 2160);

resource_activity = number of resources used by the user (maximum of 1 per
hour) during the previous ninety days (i.e., ranges from 0 to 2160);

forecast_score = maximum, over all challenges entered, of the means of the
percentile scores for each challenge entered;

annual_fees_paid = the current amount of annual fees paid by the user;

num_club_forecasts = the number of forecasts made in the past ninety days
by clubs while the participant was a member of such clubs;

ad_banner_clicks = the number of advertisement banner clicks by the user in
the previous ninety days;

num_referred_customers = the number of new paying customers referred by
the user in the past ninety days;

cBucks_earned = the amount of cBucks earned by the user in the past ninety
days;

all a_j, b_j are real numbers; initially it is preferable that a_j = 1.0, b_0 = 1.5,
b_[index illegible] = 1.0, and all other b_j = 0; however, these parameters
preferably are changed based on experience; for example, any or all of such
parameters might be incremented by 0.01 until optimal values are determined;

α and γ are real numbers and initially it is preferable that α and γ = 1.0;
however, these parameters preferably are changed based on experience; for
example, either or both of such parameters might be incremented by 0.01 until
optimal values are determined.

Each user's IB then preferably is divided by the count of the number of items
that the user rated during the Assessment Period to generate an "Intensity Weight"
(IW). The point values assigned by a user (either for access alone, ratings alone
or both) are then multiplied by the Intensity Weight to generate modified points. By
so doing, those who are most involved with the site are given the most weight in
determining the value of rated items.
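
The Intensity Budget and Intensity Weight calculations might be sketched as
follows. This is an illustration under stated assumptions rather than the
formula itself: the α and γ term is omitted, the choice of soapbox_activity
as the second factor with exponent 1.0 is arbitrary, a coefficient of 1.0 is
applied to every factor, and a user who rated no items is given a weight of
zero. The function names and sample activity figures are hypothetical.

```python
# Factor names follow the variable definitions given in the text.
FACTORS = [
    "num_forecasts", "soapbox_activity", "resource_activity",
    "forecast_score", "annual_fees_paid", "num_club_forecasts",
    "ad_banner_clicks", "num_referred_customers", "cBucks_earned",
]


def intensity_budget(activity, a, b):
    """Product of (1 + a_j * x_j) ** b_j over the nine activity measures.

    The alpha/gamma term of the source formula is deliberately left out.
    """
    budget = 1.0
    for name in FACTORS:
        budget *= (1.0 + a[name] * activity.get(name, 0.0)) ** b[name]
    return budget


def intensity_weight(budget, items_rated):
    """IB divided by the number of items rated in the Assessment Period."""
    # Assumption: a user who rated nothing contributes zero weight.
    return budget / items_rated if items_rated else 0.0


a = {name: 1.0 for name in FACTORS}
b = {name: 0.0 for name in FACTORS}
b["num_forecasts"] = 1.5     # b_0 = 1.5, per the text
b["soapbox_activity"] = 1.0  # assumption: the text does not legibly say which b_j is 1.0

activity = {"num_forecasts": 3, "soapbox_activity": 9}  # hypothetical user
ib = intensity_budget(activity, a, b)
iw = intensity_weight(ib, items_rated=20)
modified_points = 2 * iw  # an "Excellent" (+2) rating from this user
```

Because the budget is spread over the number of items rated, a user who rates
many items contributes less per rating than an equally active user who rates
few.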

In addition, these modified points may be further modified according to a
possibly nonlinear (and possibly asymmetric) transformation function. For example,
the values may be weighted by their square (but maintaining the sign of the rating),
placing more weight on extreme values (and opinions). It is noted that this further
transformation may be performed either without applying the IW weighting, before
the IW weighting is applied, or after the IW weighting is applied.
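
The sign-preserving square mentioned above can be written in a few lines. The
function name is hypothetical, and this is only one of the possible
transformation functions the text contemplates.

```python
import math


def signed_square(points):
    """Square the magnitude of a rating while keeping its sign."""
    return math.copysign(points * points, points)


# Applied to the five rating options -2 .. +2.
transformed = [signed_square(p) for p in (-2, -1, 0, 1, 2)]
```

Under this transformation a "Terrible" rating counts four times as much as a
"Poor" rating, rather than twice as much.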

In addition, the number of points assigned as a result of a user's ratings might
be modified based on the user's ratings history. Thus, for example, the ratings of
users whose ratings typically do not exhibit much dispersion might be spread out
relative to those of users whose ratings are more dispersed. Similarly, the ratings of
users whose ratings exhibit a bias relative to the norm might be adjusted so that the
user's central tendency is more aligned with the group norm.
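
One way to realize both adjustments is to standardize each user's ratings
against that user's own history and then rescale them to the group's mean and
spread. The text does not prescribe a method; the standardization below, the
handling of a user with no dispersion at all, and the sample figures are
assumptions made for illustration.

```python
from statistics import mean, pstdev


def adjust_ratings(user_ratings, group_mean, group_sd):
    """Re-centre and re-scale one user's ratings to the group's mean and spread."""
    user_mean = mean(user_ratings)
    user_sd = pstdev(user_ratings)
    if user_sd == 0:
        # Assumption: identical ratings carry no information about relative
        # preference, so each is mapped to the group mean.
        return [group_mean for _ in user_ratings]
    return [group_mean + group_sd * (r - user_mean) / user_sd for r in user_ratings]


# A hypothetical user who only ever answers "Good" or "Excellent".
adjusted = adjust_ratings([1, 1, 2, 2], group_mean=0.0, group_sd=1.0)
```

In this example the user's habitual positive bias is removed and the narrow
range of +1 to +2 is stretched to -1 to +1.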







For the sake of simplicity, any references hereafter to the term "points" shall 
include any modifications described above. 

The points described above may be used directly to rank the resources
against each other. However, doing so would likely result in significant week-to-week
fluctuations that might not accurately reflect the long-term usefulness of the various
resources. Accordingly, in the preferred embodiment of the invention, such rankings
are performed by taking into account the total number of points received by each
resource over time, with the number of points further back in time given less weight
than points received more currently. For example, the points received by a resource
might be converted into a score according to the following formula.

\[
\text{Score} = \sum_{t=0}^{25} a_t \, e^{-rt}
\]

where t is the week number (i.e., 0 corresponds to the past week, 1 corresponds to
two weeks ago, etc.), a_t = the sum of all points during week t, and r = a real number
which may be chosen based on how quickly one desires to devalue prior weeks'
points; in the current embodiment r = 0.1. Similarly, the upper limit for t may also be
varied.
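
The score calculation can be sketched directly from these definitions. The
function name, the list-based input and the sample point totals are
assumptions; the default of 26 terms corresponds to t running from 0 to 25.

```python
import math


def decayed_score(weekly_points, r=0.1, max_weeks=26):
    """Sum of a_t * exp(-r * t), where weekly_points[0] is the past week."""
    return sum(
        a_t * math.exp(-r * t)
        for t, a_t in enumerate(weekly_points[:max_weeks])
    )


# The same 10 points count for less the further back they were earned.
recent = decayed_score([10, 0, 0])
older = decayed_score([0, 0, 10])
```

With r = 0.1, points earned in a given week retain roughly 90% of their value
one week later and roughly 8% after 25 weeks.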

After determining scores, such as in the foregoing manner, the various
resources can be ranked against each other. Typically, Soapboxes will be ranked
against other Soapboxes, Archive items will be ranked against other Archive items,
and Dumpster items will be ranked against other Dumpster items. Such scores,
rankings and/or points can be used to identify the top items or Soapboxes, to
compensate Soapbox Proprietors, to promote items out of the Dumpster and into the
Archives, and/or for a variety of other purposes.

In this regard, Soapbox Proprietors may be compensated in any of a variety
of ways. For example, a Proprietor may be given a fixed monthly stipend (such as
50 cBucks) and/or also may earn additional compensation based on the Soapbox's
current score (e.g., (1 + score) * 0.0001), the total number of points over a given
period of time, and/or the Soapbox's ranking in comparison to other Soapboxes. The
following is an example of one technique for rewarding Proprietors based upon the
ranking of their Soapboxes, where the rankings are determined and the following
compensations paid each month:



Top 5%: $800 per month + Advanced Service + 200 cBucks
Next 10%: $400 per month + Advanced Service + 100 cBucks
Next 20%:
Next 40%:
Other:

In addition to a number of Soapboxes that depend upon their ratings for their
continued survival, there may also be included a number of Soapboxes that are
available to paying Proprietors ("commercial Soapboxes"). The price for obtaining
such commercial Soapboxes might be fixed or might be determined based on an
auction of such commercial Soapboxes. Although the ranked and commercial
Soapboxes might be available to the general public without first accessing the
contest website, it is preferable to restrict the availability of at least some of the
Soapboxes so that they are accessible only through the contest website.

The above rankings might also be used to designate items in the Archives
according to their popularity or usefulness. For example, there might exist a
separate section of the Archives that contains only the top 40. Alternatively, or in
addition, the rankings might be used to prioritize items located pursuant to a keyword
or other search of the Archives. Furthermore, the rankings themselves might be
used as a search criterion for obtaining items from the Archives (e.g., to retrieve
published articles about combination forecasting, but only those in the top 25% of the
rankings).

The rankings may also be used for Dumpster items in the same manner as
for items in the Archives. In addition, the rankings can be used alone or in
combination with other variables to determine when to promote an item out of the
Dumpster and into the Archives. For example, the top x% of the Dumpster items in
each week might automatically be promoted into the Archives. Alternatively,
promotion might require an item to be in the top x% for a specified minimum number
of weeks. Similarly, promotion might be based on achieving a specified minimum
number of points, a specified minimum score, or a specified minimum of either over
a predetermined minimum period of time.
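
The second alternative above (an item must remain in the top x% for a
specified minimum number of weeks) might be sketched as follows. The data
layout, the function names, the rule that at least one item always counts as
"top", and the sample scores are assumptions made for this example; min_weeks
is expected to be at least 1.

```python
def top_fraction(scores, fraction):
    """Ids of the highest-scoring items making up the given fraction (at least one)."""
    k = max(1, int(len(scores) * fraction))
    ranked = sorted(scores, key=scores.get, reverse=True)
    return set(ranked[:k])


def items_to_promote(weekly_scores, fraction, min_weeks):
    """Items in the top fraction in each of the last `min_weeks` weeks.

    weekly_scores is a list of {item_id: score} dicts, most recent week last.
    """
    if len(weekly_scores) < min_weeks:
        return set()
    recent = weekly_scores[-min_weeks:]
    winners = top_fraction(recent[0], fraction)
    for week in recent[1:]:
        winners &= top_fraction(week, fraction)
    return winners


# Hypothetical Dumpster scores for two consecutive weeks.
weeks = [
    {"a": 9, "b": 5, "c": 1, "d": 0},
    {"a": 8, "b": 9, "c": 2, "d": 1},
]
promoted = items_to_promote(weeks, fraction=0.5, min_weeks=2)
```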

In the foregoing manner, the present invention allows users to participate in
determining the types of resources that are available to them over a website, thereby
helping to ensure that the website content stays relevant to the end users.

Combination Forecasting Using Clusterization 

In addition to allowing participants and third parties to compare the prediction
accuracies of the various participants in a wide variety of categories, the contest
described above also results in an enormous database of prediction data.
Calculating even existing statistical measures based on the data in such an
enormous longitudinal database can provide information that is qualitatively different
than the information that is available when obtaining similar statistical measures
based on forecast data from smaller, more homogeneous groups. In addition, the
present invention also provides certain novel processing techniques for generating
new statistical measures and for creating improved combination forecasts.

Although in the preferred embodiment of the invention the database is
generated from a forecasting contest, any other method may be used to obtain a
large quantity of financial and economic forecasting information from a very large
longitudinal forecast panel (e.g., thousands, tens of thousands or even hundreds of
thousands of participants). Whatever technique is in fact utilized, such information
generally will share a common problem. Specifically, such a large number of
forecasters typically cannot be expected to participate at the same level or at the
same times. Thus, individual forecasters may come and go, and each forecaster
typically will participate according to his or her own schedule, which often may not
be fixed or regular. Although some forecasters will submit predictions regularly,
others may submit only sporadically. These problems are particularly troublesome
in combination forecasting, which conventionally attempts to weight the predictions
for each forecaster based on performance over a period of time, thus requiring a
consistent pool of forecasters.

In order to cope with the foregoing problems, conventional combination
forecasting techniques often simply discarded much of the sporadic forecast
information, as well as forecast information from participants who did not participate
during the entire time period of interest. This approach has severely limited the
effectiveness of performing large scale combination forecasting, to the point that
combination forecasting has tended to focus on relatively small groups that could be
counted on to consistently provide predictions.

The present invention overcomes these difficulties, thus permitting large scale
combination forecasting, in the following manner. First, participants are grouped into
clusters based on similarities of their predictions. Specifically, it is noted that with a
massive forecasting panel, there is likely to be significant redundancy among the
individual forecasts, as people rely on similar newsletters, broadcasts, or forecasting
methodologies. Utilizing cluster analysis, a standard statistical grouping method, in
an innovative manner, the present invention is able to take advantage of these
forecasting redundancies to address the nonparticipation problem when computing
optimal nonlinear combination forecasts.

Next, forecast statistics are determined for each cluster. Finally, each cluster
statistic is weighted (based on dispersion within the cluster and historical accuracy
of the cluster) and the cluster statistics are combined to produce a combination
forecast. In this manner, the cluster statistics can still be used even if the individual
participants in the clusters vary over time.
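
The weighting and combining step might be sketched as follows. The text
states only that weights depend on within-cluster dispersion and historical
accuracy; the particular weight used here (accuracy divided by one plus the
variance), the use of the mean as the cluster statistic, the linear weighted
average, and all names and figures are assumptions made for illustration.

```python
from statistics import mean, pvariance


def combine_clusters(cluster_forecasts, cluster_accuracy):
    """Weighted average of cluster means.

    cluster_forecasts maps cluster id -> forecasts submitted this period.
    cluster_accuracy maps cluster id -> a non-negative historical accuracy score.
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for cluster_id, forecasts in cluster_forecasts.items():
        if not forecasts:
            continue  # no member of this cluster forecast this period
        statistic = mean(forecasts)
        # Hypothetical weight: more accurate, less dispersed clusters count more.
        weight = cluster_accuracy[cluster_id] / (1.0 + pvariance(forecasts))
        weighted_sum += weight * statistic
        total_weight += weight
    if total_weight == 0:
        raise ValueError("no cluster supplied a usable forecast")
    return weighted_sum / total_weight


combined = combine_clusters(
    {"A": [1500.0, 1520.0], "B": [1400.0, 1400.0], "C": []},
    {"A": 0.8, "B": 0.2, "C": 0.5},
)
```

Because the calculation uses whichever members of a cluster happened to
forecast, it does not depend on any individual participating every period.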

Additionally, in order to cope with new participants, formulas are determined
for assigning participants to the clusters based on their personal characteristic
information. Specifically, formulas are sought which result in clustering that is as
close as possible to the clustering that was obtained based on the forecasters'
predictions. Once these formulas have been obtained, new participants can be
assigned to a cluster based solely on the personal characteristic information that
they have provided. Preferably, participants are periodically also reassigned to
clusters (i.e., the clusters are re-formed), and the corresponding formulas for
assigning new participants to clusters recalculated, in order to reflect societal
changes over time.
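
As one simple stand-in for such assignment formulas, the sketch below
computes the average characteristic vector of each forecast-derived cluster
and assigns a new participant to the nearest one. The nearest-centroid rule,
the numeric encoding of characteristics (here age and years of investing
experience), and all names and data are assumptions; the text does not specify
the form of the formulas.

```python
import math


def characteristic_centroids(characteristics, cluster_of):
    """Average characteristic vector for each existing cluster."""
    sums, counts = {}, {}
    for participant, vector in characteristics.items():
        cluster = cluster_of[participant]
        if cluster not in sums:
            sums[cluster] = [0.0] * len(vector)
            counts[cluster] = 0
        sums[cluster] = [s + v for s, v in zip(sums[cluster], vector)]
        counts[cluster] += 1
    return {c: [s / counts[c] for s in sums[c]] for c in sums}


def assign_cluster(vector, centroids):
    """Cluster whose average characteristics are closest to the new participant."""
    return min(centroids, key=lambda c: math.dist(vector, centroids[c]))


# Hypothetical participants: (age, years of investing experience).
characteristics = {"p1": [25, 1], "p2": [30, 2], "p3": [60, 20], "p4": [65, 25]}
cluster_of = {"p1": 0, "p2": 0, "p3": 1, "p4": 1}  # from forecast-based clustering
centroids = characteristic_centroids(characteristics, cluster_of)
new_cluster = assign_cluster([58, 15], centroids)
```

When the clusters are re-formed, the centroids would be recomputed from the
new assignments in the same way.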

The foregoing technique is described in more detail with reference to Figure
11. Briefly, according to Figure 11, clusters are formed, cluster assignment formulas
are calculated, cluster statistics are generated, and then the cluster statistics are
weighted and combined. Each time new combination forecasts are desired, the
current participants are divided into the appropriate clusters and the foregoing
generating, weighting and combining steps are repeated. In addition, periodically,
new clusters are formed and new assignment formulas calculated.

In more detail, in step 190 of Figure 11 new clusters are formed based on the
prediction values of the individual participants. These cluster identifications
preferably are done only on the basis of the forecasts themselves. Cluster analysis
algorithms (such as are available in Systat and numerous other multivariate statistics
computer programs) attempt to group the data into clusters such that the measured
distance between individual data points within each cluster is minimized, but also
such that the measured distance between any two clusters is maximized. In other words,
cluster analysis attempts to group data points so that the members of each group are
as much alike as they can reasonably be, but also so that the groups are as different
from one another as they can reasonably be.

There are numerous standard methods for clustering data which could be
employed, including: discriminant functions, factor analysis, and grouping
techniques such as iterated Chi-Square and maximum-distance measures.

In the preferred embodiment of the invention, vectors of forecasts for each
individual are used as the columns in a matrix, with each row associated with a
particular forecast date. The individual forecasters are clustered using Systat or a
similar program. The currently preferred method is the KMEANS
statistical procedure included in statistical packages such as SYSTAT and the S+
statistical modeling language. In this case, the forecast data matrix preferably is
constructed as an (n x p) matrix, with n forecasters and p possible forecasts to be
reflected by the cluster; if p equals 1, then unique clusters are computed for each
forecast; if unique clusters are identified for each regular time horizon, then p would
equals. Initially, p will be set to 1.

The KMEANS algorithm splits the n forecasters into groups by maximizing the
between-group distance and minimizing the within-group distance. While there are
numerous possible distance measures which could be used, such as Pearson
Product Moment Correlation, Sum of Squared Deviations, and R-squared (1 -
Squared Pearson Product Moment Correlation), the preferred embodiment uses the
Minkowski distance, the z-th root of the mean z-th powered coordinate distance,
with an initial parameter z = 2. This will result in g clusters being created.
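
As a minimal sketch of the clustering step described above, the following Python code groups an (n x p) forecast matrix into g clusters using the Minkowski distance with z = 2. It assumes NumPy is available. The function names `minkowski` and `kmeans_forecasters`, the random initialization, and the iteration limit are hypothetical choices made for illustration; this is a generic assign-and-update k-means loop, not the KMEANS procedure of SYSTAT or S+.

```python
import numpy as np

def minkowski(a, b, z=2.0):
    """z-th root of the mean z-th powered coordinate distance."""
    return np.mean(np.abs(a - b) ** z, axis=-1) ** (1.0 / z)

def kmeans_forecasters(X, g, z=2.0, n_iter=100, seed=0):
    """Group the n rows (forecasters) of X into g clusters.

    X is an (n x p) array: n forecasters, p forecasts per forecaster.
    Returns (labels, centers).
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    # Assumption: start from g distinct forecasters chosen at random.
    centers = X[rng.choice(len(X), size=g, replace=False)]
    labels = None
    for _ in range(n_iter):
        # Distance from every forecaster to every center: shape (n, g).
        d = minkowski(X[:, None, :], centers[None, :, :], z)
        new_labels = d.argmin(axis=1)
        if labels is not None and np.array_equal(new_labels, labels):
            break  # assignments stopped changing
        labels = new_labels
        for j in range(g):
            members = X[labels == j]
            if len(members):
                # The mean is the exact minimizer only for z = 2.
                centers[j] = members.mean(axis=0)
    return labels, centers

# Six forecasters, one forecast each (p = 1), two obvious groups.
X = np.array([[1.0], [1.1], [0.9], [10.0], [10.2], [9.8]])
labels, centers = kmeans_forecasters(X, g=2)
```

With p = 1, as in the initial setting described above, each forecaster is represented by a single forecast and the distance reduces to an absolute difference.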

It is noted that a different set of clusters may be generated for each possible
category (e.g., one set of clusters for short-term Microsoft stock, one set for long-term
Microsoft stock, one set for long-term DJIA), where each category is a different
variable/time-frame combination. However, more preferably, at least some of the
sets of clusters will be formed based on predictions over multiple different categories
(e.g., short-term DJIA, short-term price of Microsoft stock and short-term NASDAQ
index). The optimal combinations of categories to use for forming the various
clusters, as well as the categories for which those clusters will be used in forming
combination forecasts, can be determined empirically by mining the database using,
for example, neural network techniques.

In step 191, the cluster assignments formed in step 190 are statistically
associated with demographic and other personal characteristic information, such as
Internet or specific website (e.g., the contest website) usage patterns. For example,
the information for each of a number of personal characteristic traits can be first
converted into quantitative data in a predetermined manner. Next, a parametric
equation that includes the personal characteristic variables, together with the still
unknown parameters, is constructed. Such a parametric equation might, for
example, be a simple linear combination of the personal characteristic variables.
Finally, the values of the parameters are determined in a manner so that the
mapping based on the personal characteristic data as closely as possible matches
the clusterization based on the forecast similarities. Such optimization can be
accomplished using linear or non-linear regression techniques, such as by finding
the parameters that result in minimum squared error, or by using any other
optimization criteria. The resulting model will be used to provide preliminary cluster
assignments for new forecast participants.

Using multinomial logit regression, such as implemented in Systat and other
multivariate statistical programs, the best assignment formulas can be computed
which relate the demographic and other variables to the cluster assignment.
Alternatively, for example, using Classification and Regression Tree (CART) techniques,
such as implemented in SPSS and other multivariate statistical programs,
assignment formulas based on the demographic variables can be determined. Still
further, for example, using Chi-Square Automatic Interaction Detection (CHAID), such
as implemented in SPSS and other multivariate statistical programs, assignment
formulas based on the demographic variables can be determined.

Multinomial logit, CART, and CHAID techniques are among numerous
multivariate techniques which can be applied to solve the assignment formula
problem, but currently the preferred embodiment utilizes multinomial logit because
it is believed that better statistical interpretations can be made from the resulting
equations (for example, the interpretation of odds ratios, which allows the direct
evaluation of the relative importance of different variables as assignment predictors).
For example, once the cluster assignments are made based on the (n x p)
forecasting matrix, the (n x 1) cluster assignment vector can be appended to the
(n x k) forecaster characteristics matrix containing the k characteristics (demographics
and subscription variables). Using the k characteristics, a mathematical function can
be estimated in which the (n x k) characteristics matrix is used to predict the value
of the (n x 1) cluster assignment vector. This will be a nonlinear function estimated
using multiple logit regression on the g possible cluster values, a statistical technique
similar to regression.
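
The mapping from the (n x k) characteristics matrix to the (n x 1) cluster assignment vector can be sketched as follows. This is only an illustration under stated assumptions: it uses NumPy and plain gradient descent on the multinomial logit (softmax) likelihood rather than the estimation routines of Systat or any other package, and the names `fit_multinomial_logit` and `assign_cluster`, the learning rate, and the iteration count are hypothetical.

```python
import numpy as np

def fit_multinomial_logit(C, labels, g, lr=0.05, n_iter=2000):
    """Fit a multinomial logit of cluster label on characteristics.

    C      : (n x k) characteristics matrix (already quantified)
    labels : length-n vector of cluster indices in 0..g-1
    Returns a (k+1 x g) weight matrix (first row is the intercept).
    """
    C = np.asarray(C, dtype=float)
    labels = np.asarray(labels, dtype=int)
    n, k = C.shape
    Xb = np.hstack([np.ones((n, 1)), C])   # prepend intercept column
    Y = np.eye(g)[labels]                  # one-hot targets, (n x g)
    W = np.zeros((k + 1, g))
    for _ in range(n_iter):
        scores = Xb @ W
        scores -= scores.max(axis=1, keepdims=True)  # numerical stability
        P = np.exp(scores)
        P /= P.sum(axis=1, keepdims=True)            # softmax probabilities
        # Gradient of the mean negative log-likelihood.
        W -= lr * Xb.T @ (P - Y) / n
    return W

def assign_cluster(W, c_new):
    """Interim cluster assignment for new participants."""
    c_new = np.atleast_2d(np.asarray(c_new, dtype=float))
    Xb = np.hstack([np.ones((len(c_new), 1)), c_new])
    return (Xb @ W).argmax(axis=1)

# Toy data: k = 2 characteristics, g = 2 clusters from the forecast-based step.
C = [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]]
labels = [0, 0, 0, 1, 1, 1]
W = fit_multinomial_logit(C, labels, g=2)
new_assignments = assign_cluster(W, [[0.0, 0.5], [5.5, 6.0]])
```

A statistical package would additionally report standard errors and odds ratios, which is the interpretive advantage the text attributes to multinomial logit.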

As a robustness check on the multiple logit regression analysis, a genetic algorithm
can be applied using a standard implementation such as the Palisade Software "Risk
Optimizer" or the S+ Genetic Algorithm Library to check for other solutions to the
problem of mapping the characteristic matrix onto the cluster assignment vector. By
using the multiple logit regression weights as initial values for the Genetic Algorithm
assignments, the multinomial logit likelihood function can be evaluated repeatedly
to ensure that the results are global rather than local optima.

The resulting multiple logit regression model will be used to give interim
cluster assignments to new forecasters until new cluster assignments are computed.

In step 192, various cluster statistics are generated for each of the clusters
formed in step 190. Specifically, a number of clusters will be associated with each
variable for which a combination forecast is to be generated. Thus, if a combination
forecast is desired for the short-term DJIA, statistics will be generated from the set
of clusters associated with that prediction category. Preferably, these statistics also
include a measure of central tendency for the cluster forecasts, such as the median
or the trimmed mean, computed using an optimally computed trimming function, with
the trimming thresholds established to minimize the mean-squared forecast error for
each forecast time horizon for each cluster. This will result in a cluster forecast
which will contain representative information from the cluster, but without the need
for each individual to be frequently updating forecasts. In addition, various
dispersion measures can be computed for each cluster, such as the standard
deviation or the expectational uncertainty measure (EUM), defined here as the
range of the dataset after trimming, as a percentage of the median.
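
The cluster statistics named above can be sketched as follows. The code assumes NumPy; the function name `cluster_statistics` is hypothetical, and the fixed 15th/85th percentile trimming thresholds are an assumption used for illustration, whereas the text calls for thresholds chosen to minimize mean-squared forecast error.

```python
import numpy as np

def cluster_statistics(forecasts, lower_pct=15.0, upper_pct=85.0):
    """Central tendency and dispersion statistics for one cluster."""
    x = np.sort(np.asarray(forecasts, dtype=float))
    lo, hi = np.percentile(x, [lower_pct, upper_pct])
    kept = x[(x >= lo) & (x <= hi)]        # observations surviving the trim
    median = np.median(x)
    return {
        "median": float(median),
        "trimmed_mean": float(kept.mean()),
        "std": float(x.std(ddof=1)),
        # Range after trimming, as a fraction of the median.
        "eum": float((hi - lo) / median),
    }

# One outlying forecast (500) barely moves the trimmed mean.
stats = cluster_statistics([90, 95, 100, 105, 110, 500])
```

In this toy cluster the trimmed mean equals the median (102.5) even though one forecast is far from the rest, which is the robustness the trimming is meant to provide.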

In step 195, the cluster statistics are weighted and combined to produce
combination forecasts and other statistical indicators. Specifically, the measures of
central tendency preferably can be used as the predictor variables in optimal
nonlinear forecast combination equations which combine the information across the
clusters in a way that minimizes mean-squared forecasting error or another loss
function. Functions of the measure of dispersion within a cluster may be used to
determine whether the given cluster should be given relatively more or less weight
in the optimal combination forecast. For example, when a cluster is more "tight"
about its central tendency, that cluster will be given more weight. When it is more
dispersed, that cluster will be given less weight.

For example, using the optimal clusters and the statistics derived from them,
including central tendency and dispersion statistics, a nonlinear model with
endogenous parameters is readily estimated. In one example, the model is a fourth
order Taylor Series expansion around the dispersion statistics for the various
clusters. The Taylor Series coefficients can then be determined using a regression
technique based on historical accuracies of the clusters. As a result, the weight
given to a particular cluster in this example varies based on a function of the
dispersion statistic for the cluster and based on historical accuracy of the cluster.
Moreover, using different clustering for different categories, the specific weighting
can be specific to each category (i.e., each forecast variable/time-horizon
combination). Similarly, based on historical values of cluster forecasts and
realizations, an optimal linear aggregation equation can be readily estimated for
purposes of producing aggregate forecasts for particular forecast horizons.

For example, a linear combination method similar to the Granger-Ramanathan
technique can be used to compute a linear regression with the historically realized
values of the target series as the dependent variable and with the historical cluster
means (or medians) as the independent variables. The result is an optimal linear
forecast combination of the cluster values.
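
A regression of this kind can be sketched in a few lines. The code below assumes NumPy and ordinary least squares with an intercept; the function names `fit_linear_combination` and `combine` and the toy numbers are hypothetical and are not taken from the source.

```python
import numpy as np

def fit_linear_combination(cluster_forecasts, realized):
    """Regress realized values on historical cluster means (or medians).

    cluster_forecasts : (T x g) array, one column per cluster
    realized          : length-T vector of realized target values
    Returns [intercept, w_1, ..., w_g].
    """
    F = np.asarray(cluster_forecasts, dtype=float)
    y = np.asarray(realized, dtype=float)
    A = np.hstack([np.ones((len(F), 1)), F])   # intercept plus cluster columns
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def combine(coef, current_cluster_forecasts):
    """Combination forecast from the current cluster statistics."""
    return float(coef[0] + np.dot(coef[1:], current_cluster_forecasts))

# Toy history: five periods, two clusters.
F = np.array([[100, 102], [101, 99], [103, 104], [98, 101], [105, 100]], dtype=float)
y = 2.0 + 0.3 * F[:, 0] + 0.7 * F[:, 1]       # constructed so the fit is exact
coef = fit_linear_combination(F, y)
forecast = combine(coef, [100.0, 100.0])
```

Because the toy target is an exact linear function of the two cluster series, the fitted weights recover 0.3 and 0.7; with real data the weights would reflect each cluster's historical accuracy.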

Numerous other nonlinear functions can also be implemented. A particularly
useful nonlinear forecasting combination method which allows for regime switching
can be implemented as follows. The same dependent and independent
variables are used as in the linear method described above. In addition, the forecast
combination weights are allowed to vary as functions of other forecasts as well as
other cluster statistics.

If the coefficient on the i-th forecast is $\beta_i$, then $\beta_i$ is a constant in the linear
model but is a function here. One implementation is as follows:

\[
\begin{aligned}
\beta_i ={}& a_0 + a_1\,\mathbf{1}[\mathrm{mean}_i - \mathrm{median}_i > \Phi_i]\,(\mathrm{mean}_i - \mathrm{median}_i) \\
 &+ (a_2\,\sigma_i + a_3\,\sigma_i^2)\,\mathbf{1}[\sigma_i > \Omega_i] \\
 &+ a_4\,\mathbf{1}[\text{Forecast Change in Stock Index} > \Gamma_i] \\
 &+ a_5\,\mathbf{1}[\text{Forecast Change in Stock Index} < \Gamma'_i] + \cdots
\end{aligned}
\]

where $\mathbf{1}[\cdot]$ equals 1 when the bracketed condition holds and 0 otherwise;
$\Gamma_i$, $\Gamma'_i$, $\Phi_i$, and $\Omega_i$ are iteratively estimated threshold parameters;
$\sigma_i$ is the measure of dispersion within the i-th cluster; and $\mathrm{mean}_i$ and
$\mathrm{median}_i$ are the mean and median of cluster i's forecasts. In this model, the
combination weight for cluster i begins with its linear weight, which is adjusted by the
difference between the mean and the median (one measure of asymmetry in the forecast
distribution) if the difference exceeds some threshold, by the first two terms of a
Taylor series expansion with respect to dispersion if dispersion exceeds some threshold,
and by a shift factor if the expected stock market change either exceeds or falls below
separate threshold levels. Additional terms in the coefficient equation can include
the Expectational Uncertainty Measure, higher moments of the cluster forecast
distribution, and/or the magnitude of historical forecast errors.
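
One reading of the regime-switching weight described above can be sketched as a plain Python function. The parameter names (`phi_i`, `omega_i`, `gamma_i`, `gamma_prime_i`) are hypothetical labels for the four threshold parameters, the function name `combination_weight` is likewise hypothetical, and the coefficient values in the example are invented for illustration only.

```python
def combination_weight(a, mean_i, median_i, sigma_i, index_change,
                       phi_i, omega_i, gamma_i, gamma_prime_i):
    """Combination weight for cluster i under the threshold model.

    a            : sequence (a0, a1, a2, a3, a4, a5) of estimated coefficients
    sigma_i      : dispersion measure for cluster i
    index_change : forecast change in the stock index
    """
    skew = mean_i - median_i
    beta = a[0]                                  # linear weight
    if skew > phi_i:                             # asymmetry adjustment
        beta += a[1] * skew
    if sigma_i > omega_i:                        # first two Taylor terms in dispersion
        beta += a[2] * sigma_i + a[3] * sigma_i ** 2
    if index_change > gamma_i:                   # upward market regime shift
        beta += a[4]
    if index_change < gamma_prime_i:             # downward market regime shift
        beta += a[5]
    return beta

a = (0.5, 0.1, 0.2, 0.05, 0.3, -0.3)             # illustrative values only
calm = combination_weight(a, 100.0, 100.0, 0.5, 0.0, 1.0, 1.0, 0.03, -0.03)
stressed = combination_weight(a, 102.0, 100.0, 2.0, 0.05, 1.0, 1.0, 0.03, -0.03)
```

When no threshold is crossed the weight collapses to the constant linear weight a0, which matches the statement that the linear model is the special case.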

In step 196, it is determined whether a new combination forecast is required
for a particular category. If so, in step 198 the participants whose predictions are to
be used in the combination are sorted into clusters, preferably based on the most
recent clusterization for the particular variable under consideration and (for
participants who were not included in that clusterization) by using the assignment
formulas calculated in step 191. In certain embodiments, it is possible to exclude
certain new participants in cases (i.e., certain combinations of personal characteristic
data) where it has been determined that the assignment formulas are less reliable
at assigning participants to the appropriate cluster and to include new participants
only within personal characteristic regions where the results from the assignment
formulas and from the forecast-based clusterization are more highly correlated.
Alternatively, it is also possible to sort all the participants into clusters based on the
assignment formulas. Upon completion of such sorting, steps 193 and 195 are
repeated.

In step 199, it is determined whether clusterization is required. This will be the
case where a combination forecast is desired for a new category. Re-clusterization
also preferably will be performed periodically for existing categories so as to reflect
changing attitudes, etc., with the interval between re-clusterizations being determined
empirically. If clusterization is required, the process returns to step 190.

In addition to its use in connection with combination forecasting, relating
the relative statistical weight of each cluster to its associated demographics, if any,
may also provide powerful marketing information about which demographics make
the highest contribution to forecast accuracy. For example, one could use such
information to target job candidates or new participants for the forecasting contest.

Forecasting Using Interpolation Modeling 

By utilizing interpolation model forecasts, the combination forecasts calculated
using the technique described above can be used to forecast other variables not
specifically forecasted or can be used when the number of participants submitting
predictions for such other variables is insufficient to provide a statistically meaningful
combination forecast. Specifically, a price interpolation model can be fit for a
variable, such as a common stock price or other asset price, based on
contemporaneously available forecasts of other variables (e.g., prices of other stocks
but not the target stock). The resulting interpolation model forecast provides a
baseline forecast given stable relations in the market and can be used to provide
initial stock forecasts.

This approach estimates the value of a particular variable (e.g., the price or
value of an asset) using regression analysis and independently produced forecasts
for other variables (referred to herein as predictor variables). Initially, a regression
technique (preferably, stepwise linear regression) is performed to find a best fit
between previously predicted values for the predictor variables (which are different
from the target variable) and the historical realized values for the target variable.
Preferably, the previously predicted values for the predictor variables (such as
previous combination forecasts for those variables) are predicted for time points that
are the same as, or at least contemporaneous with, the time points associated with the
historical values of the target variable.

Upon completion of the regression analysis, it may be decided to utilize some
or all of the predictor variables to predict the value of the target variable, based on
how closely the predictions for each predictor variable were correlated with the
historical values of the target variable. For example, where the correlation is below
a specified minimum threshold, the subject predictor variable may automatically be
excluded. Currently predicted values for the remaining predictor variables (such as
current combination forecasts) are then plugged into the forecast model
corresponding to the regression technique utilized, together with the parametric
values identified when performing the regression analysis (e.g., weighting
coefficients), in order to obtain a forecast for the target variable.

Thus, if there are (n+m) stocks being considered for forecasting, (n+m-1)
stocks can be considered as possible predictors for the (n+m)-th stock. For
example, a data matrix can be created in which the first column is comprised of the
historical values actually observed for the target stock (with each row associated with
a unique observation period). The remaining columns can then be populated with
forecasts for each of the other predictor candidates, such that the forecasts are
associated with realizations in the same time period as the target variable. Stepwise
linear regression is then applied to identify the n stocks of the (n+m-1) predictor
candidates which provide the best fit to the realizations of the target.
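
The selection of n predictors from the (n+m-1) candidates can be sketched with a simple forward stepwise search. This is an illustration under stated assumptions: it uses NumPy, adds predictors greedily by reduction in residual sum of squares, and stops on a relative-improvement rule, whereas statistical packages typically use F-test or similar entry criteria. The names `forward_stepwise`, `ipm_forecast`, and `min_improvement` are hypothetical.

```python
import numpy as np

def _rss(A, y):
    """Least-squares fit; returns (residual sum of squares, coefficients)."""
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return float(resid @ resid), coef

def forward_stepwise(P, y, max_predictors, min_improvement=0.01):
    """Greedy forward selection of predictor columns.

    P : (T x c) matrix of historical forecasts for the c candidate predictors
    y : length-T vector of realized values of the target
    Returns (selected column indices, [intercept, coefficients...]).
    """
    P = np.asarray(P, dtype=float)
    y = np.asarray(y, dtype=float)
    selected = []
    A = np.ones((len(y), 1))                    # start with intercept only
    best_rss, coef = _rss(A, y)
    while len(selected) < max_predictors:
        trials = []
        for j in range(P.shape[1]):
            if j not in selected:
                rss, c = _rss(np.hstack([A, P[:, [j]]]), y)
                trials.append((rss, j, c))
        if not trials:
            break
        rss, j, c = min(trials, key=lambda t: t[0])
        if best_rss - rss <= min_improvement * best_rss:
            break                               # no worthwhile improvement
        selected.append(j)
        A = np.hstack([A, P[:, [j]]])
        best_rss, coef = rss, c
    return selected, coef

def ipm_forecast(selected, coef, current):
    """Plug current predictor forecasts into the fitted model."""
    current = np.asarray(current, dtype=float)
    return float(coef[0] + coef[1:] @ current[selected])

# Toy data: four candidate predictors, target driven by columns 0 and 2.
P = np.column_stack([
    [1, -1, 1, -1, 1, -1, 1, -1],
    [1, 1, -1, -1, 1, 1, -1, -1],
    [1, 1, 1, 1, -1, -1, -1, -1],
    [1, -1, -1, 1, 1, -1, -1, 1],
]).astype(float)
y = 1.0 + 2.0 * P[:, 0] + 3.0 * P[:, 2]
selected, coef = forward_stepwise(P, y, max_predictors=2)
target_forecast = ipm_forecast(selected, coef, [1.0, 0.0, -1.0, 0.0])
```

The coefficients are returned in the order in which predictors were selected, so `ipm_forecast` indexes the current forecasts by `selected` before applying them.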

The resulting Interpolation Pricing Model (IPM) uses the forecasts of the n
stocks to produce a forecast of the (n+m)-th stock. In this fashion, quasi-consensus
forecasts for a large number of stocks can be computed without the need for a
specific forecast from the forecasting panel. This quasi-consensus forecast will likely
not be as reliable as a forecast obtained using true consensus methods. In part, this is
because the quasi-consensus forecast is based just on "non-firm-specific" information,
the price information which is common to the industry (or those securities found to be
most related to the target stock). To the extent that individual stock forecasts include
components associated with firm-specific information, these individual stock
forecasts will tend to be more accurate than the Interpolation Pricing Model. When
both types of forecasts are available, the difference between the two forecasts is a
measure of the prediction of the present value of firm-specific information; it indicates
the amount by which the stock in question is expected to outperform
(or underperform) the industry. Thus, the interpolation model forecast can be used to
provide additional information even about variables for which there are an adequate
number of participants submitting predictions.

As indicated above, the forecast error in the IPM will be due primarily to
firm-specific information, both anticipated and unexpected. The forecast error in the
consensus forecast is due primarily to unexpected firm-specific information.
Therefore, the Expected Unique Information Measure is the difference between the
median consensus forecast and the Interpolation Model Forecast, a dollar estimate
of the present value of the expected firm-specific information.

The Firm Specific Information Measure is the difference between the realized
value and the Interpolation Model Forecast, e.g., for stock pricing applications, a
dollar estimate of the present value of the actual firm-specific information. The
Unexpected Firm Specific Information Measure is the difference between the
realized value and the median consensus forecast. Each of these measures allows
for parsing new information into expected versus unexpected, and firm-specific
versus industry-wide. Such parsings are important for financial analysis of the
impact of information such as in the litigation of securities fraud class action suits.
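
The three information measures defined above are simple differences, as the following sketch shows. The function name `information_measures` and the dollar figures in the example are hypothetical.

```python
def information_measures(realized, consensus_median, ipm_forecast):
    """Decompose firm-specific information into expected and unexpected parts."""
    return {
        # Median consensus forecast minus Interpolation Model Forecast.
        "expected_unique": consensus_median - ipm_forecast,
        # Realized value minus Interpolation Model Forecast.
        "firm_specific": realized - ipm_forecast,
        # Realized value minus median consensus forecast.
        "unexpected_firm_specific": realized - consensus_median,
    }

m = information_measures(realized=52.0, consensus_median=50.0, ipm_forecast=47.0)
```

By construction the Firm Specific Information Measure is the sum of the expected and unexpected components (here 5 = 3 + 2).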

However, the IPM can be useful even when there is not an independent
consensus forecast for comparison. The IPM can act as a surrogate forecast. Using
the interpolation model forecasts, quickly updated consensus-based forecasts can
be computed even for stocks and indices which have inadequate current forecast
participation.

The estimation of the stepwise linear regressions used to form the
Interpolation Pricing Model can be accomplished using many standard computer
programs, including Systat. The comparison of forecast errors can be accomplished
using many standard computer programs including Excel and Systat. Similarly, the
computation of the IPM forecasts can be readily performed using a hand calculator,
spreadsheet, or statistical program such as Systat.

The IPM forecast should do better than traditional stock forecasts because of
the flexibility inherent in the underlying consensus forecasts (people can adjust their
predictions more quickly than a computer algorithm can be recomputed). However,
ordinarily one would not expect the IPM forecast to outperform high-quality consensus
forecasts because of the different roles played by expected firm-specific information.

Additional Statistical Measures 

In addition to providing combination forecasts using clusters, as described
above, a number of other statistical measures preferably are calculated from the
database of predictions. Such measures might include, for example, any or all of the
following.

- Overall median forecasts for each of the dozens of variables predicted in the
games and the Special Challenges. This statistic can be calculated over all
forecasters, over all participants in each Universe, or for various other groups of
participants. It can function as one measure of central tendency.

- Expectational Uncertainty Measure (EUM): (85th percentile - 15th
percentile)/Median; this provides a measure of the width of the uncertain range
around the forecast expressed as a percentage of the group forecast; this can be
monitored over time and used to indicate breaks in expectational information. Note
that the statistic ranges from zero (with no difference between the 85th and 15th
percentiles) to potentially infinity. This statistic can be calculated over all forecasters,
over all participants in each Universe, or for various other groups of participants. It
can function as a measure of dispersion of the subject predictions.

- Expectational Uncertainty Measure per Thousand: the EUM computed for 
every thousand forecasts. 

- Intraday EUM Oscillator: the ratio of the EUM of the most recent thousand
forecasts to the EUM for the current day overall (equal to 1 for the initial 1000
forecasts).

- Mean time per thousand forecasts: a flow indicator showing how frequently 
forecasts are being updated. 

- Mean percentage change within day: a measure of the average percentage
by which current-day entries have been adjusted from yesterday's final value to
today's current value; this is a measure of perceived new information content.

- Recent absolute percentage change per thousand: the absolute value of the
percentage change from the previous thousand's median to the current thousand's
forecast median; this is a measure of intraday stability of the forecasts.

- C-Squared Statistic: the forecast "confidence" statistic; for any individual
projection, take the absolute value of the revision from the previous day's entry to
today's current entry, and divide this by the sum of the sequential absolute revisions
made during the day. Square the ratio. Note that each "revision" is compared to the
previous observed value in the day. If there are no revisions from yesterday, then
C-Squared is defined to be 1. If there is only a single revision from yesterday to
today, then C-Squared will equal 1; if there are numerous revisions, but all in a
"monotonic" path, C-Squared will equal 1. If there are numerous nonmonotonic
revisions, then C-Squared will approach zero. C-Squared is an indicator of the
stability of information. For example: yesterday's final forecast was 10; today began
with 9, then finished at 12. The C-Squared statistic is:

( |12 - 10| / ( |9 - 10| + |12 - 9| ) )^2 = (2/(1+3))^2 = 1/4 = 25%.

- L-Statistic: a "leakage" measure, equal to 1/C. Take the sum of the
absolute revisions from the previous day's entry to the first of today's, the first of
today's to the second, and so on; this is the ratio's numerator. The denominator is
the absolute revision from yesterday's final value to today's final value.

- Intraday forecast median trajectory: compute the intraday forecasting 
patterns, looking at the median per thousand forecasts, expressed in a percentage 
basis with previous day's overall median as 100. 

- Intraday Forecast Oscillator: compute the ratio of the median of the most
recent thousand forecasts to the current daily median overall (equal to 1 for the initial
1000).

- Forecast Momentum Index: the recent absolute percentage change per
thousand divided by the mean time per thousand forecasts. When there is little change
in the median forecast, the Forecast Momentum Index goes to zero; when there is little
forecasting activity, the Forecast Momentum Index also goes to zero. When there is either
a large change in the median or a large increase in the frequency of forecasting, the
Forecast Momentum Index grows and can go to infinity.

- Market Volatility Measures: the standard deviation of the forecasts of the
various market indexes; this could be a rolling average of standard deviations per
thousand forecasts, or it could be an actual calculation based on all the current
forecasts active during the given day. The Forecast Volatility Curve is the plot of the
standard deviations across the forecast horizon, preferably from the end of next
week to a year from now. Note that statistical curve fitting methods (e.g., nonlinear
curves, cubic splines) can be applied to interpolate the relevant volatility measure for
any time horizon along the curve given the key points included in the samples.

- Enthusiasm Statistics: first generate median forecasts for each of the
variables by each of the teams in the Challenge and an overall median; next,
generate median forecasts according to geographic groupings and also according
to other demographic variables. The ratio of a subgroup's median to the
overall median is a measure of relative confidence or enthusiasm.
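
Two of the measures in the list above, the C-Squared Statistic and the L-Statistic, can be sketched directly from their definitions. The function names and the convention of passing the revision path as a list that starts with yesterday's final value are assumptions made for this illustration.

```python
def c_squared(path):
    """C-Squared for a revision path.

    path[0] is yesterday's final forecast; path[1:] are today's entries in order.
    """
    steps = [abs(b - a) for a, b in zip(path, path[1:])]
    total = sum(steps)
    if total == 0:
        return 1.0                     # no revisions: defined to be 1
    net = abs(path[-1] - path[0])
    return (net / total) ** 2

def l_statistic(path):
    """Leakage measure: sum of absolute revisions over the net absolute revision."""
    steps = [abs(b - a) for a, b in zip(path, path[1:])]
    total = sum(steps)
    net = abs(path[-1] - path[0])
    if net == 0:
        # Assumption: 1 when nothing moved, infinite when all movement cancelled.
        return 1.0 if total == 0 else float("inf")
    return total / net

example = [10, 9, 12]                  # yesterday 10; today 9, then 12
c2 = c_squared(example)                # (2 / (1 + 3))^2
leak = l_statistic(example)            # (1 + 3) / 2
```

For the worked example in the list, C-Squared is 0.25 and the L-Statistic is 2, consistent with L = 1/C.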



It is noted that the L-Statistic, C-Squared Statistic, and the Forecast
Oscillators can be applied to other time horizon situations as well. Breaks in the
L-Statistic and the C-Squared Statistic values indicate changes in forecast sentiment,
and may indicate other regime shifts; significant breaks (i.e., changes that meet
some predetermined criteria, such as a predetermined threshold) can be reported
through the email and pager alert services.

The following example will illustrate what one set of forecasts might look like
as measured over two days and will also illustrate how several of the unique
statistics described above can help interpret the evolution of the forecast data. Note
that these are artificially constructed data for example purposes; while it is possible
that real data would display these instabilities and rapid adjustments, it is likely that
there would be significantly less intraday forecast revision than is displayed in this
example.

Suppose that the forecast percentiles are listed below in the left column, and the
observation periods are listed across the columns. The table entries might be the
forecast values associated with the particular forecast percentile as of the forecast date:

Example Forecast Distribution Data:

Percentile    Prev.    Day 1    Day 1    Day 1    Day 1    Day 1    Day 2    Day 2    Day 2
              Close     Open    10 AM     Noon     4 PM    Close     Open     Noon    Close
0.10          75.76    79.41    82.07    84.60    86.89    88.51    90.14    91.52    92.76
0.15          76.90    80.21    82.84    85.21    87.45    89.00    90.60    91.88    93.06
0.20          77.75    80.91    83.53    85.67    87.93    89.44    90.96    92.21    93.32
0.30          79.18    82.28    84.54    86.78    88.79    90.11    91.58    92.75    93.75
0.40          80.78    83.97    85.77    87.84    89.79    90.83    92.29    93.32    94.30
0.50         113.69   113.92    88.29    90.45   108.51    92.09    93.76    94.53   104.28
0.60         118.97   116.46   114.08   112.05   110.48   108.69   107.59   106.57   105.72
0.70         120.84   117.88   115.25   113.16   111.37   109.58   108.28   107.21   106.21
0.80         122.44   119.11   116.38   114.27   112.25   110.38   108.93   107.75   106.67
0.85         123.17   119.86   117.04   114.71   112.68   110.79   109.28   108.06   106.96
0.90         124.10   120.68   117.77   115.37   113.21   111.26   109.62   108.44   107.32
1.00         135.15   129.62   125.41   122.46   117.99   115.50   113.04   111.12   110.04

The meandering of the forecast itself is clear to anyone who has watched a
stock ticker. The forecast at the previous close was 113.69; the forecast began up
a little, ending the day at 92.09. The next morning, the forecast opened a little
higher, then rose steadily throughout the day, closing at 104.28. As a measure of
market expectations, this forecast series could be sufficient by itself. However, there is
much more that one can glean from the forecast distribution data.

First, the Expectational Uncertainty Measure (EUM) can be used to measure
whether there is a convergence or divergence in the forecast marketplace over time.
The initial EUM (for the previous day's close) is computed to be 40.7%. By 4:00 p.m.
on day 1, the EUM has dropped to 23.3%. Note that although there is a major
change in the forecast from 4:00 p.m. to Close, from 108.51 to 92.09, the EUM
remains almost constant at 23.6%.

The Expectational Uncertainty Measure indicates that some of the change in
forecast from the previous close to the 4:00 p.m. value might be due to a tightening
of the forecasts, rather than to significant new information. However, the forecast
change from 4:00 p.m. to Close, accompanied by a nearly constant EUM, is directly
attributable to new information which had a uniform impact across forecasters. The
forecasters are collectively more certain by Day 1 Close than the previous day, and
have incorporated new information into their collective prediction. Day 2 opens with
an EUM of 19.9% and closes with an EUM of 13.3%. Whatever the resulting value,
the forecasters have a tighter distribution.
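
The EUM values quoted above can be reproduced from the 15th, 50th, and 85th percentile rows of the example table. The sketch below is plain Python; the variable names are hypothetical and the data are copied from the table.

```python
# Columns: Prev. Close, Day 1 Open, 10 AM, Noon, 4 PM, Day 1 Close,
#          Day 2 Open, Day 2 Noon, Day 2 Close
p15 = [76.90, 80.21, 82.84, 85.21, 87.45, 89.00, 90.60, 91.88, 93.06]
p50 = [113.69, 113.92, 88.29, 90.45, 108.51, 92.09, 93.76, 94.53, 104.28]
p85 = [123.17, 119.86, 117.04, 114.71, 112.68, 110.79, 109.28, 108.06, 106.96]

# EUM = (85th percentile - 15th percentile) / median, per observation period.
eum = [(hi - lo) / med for lo, med, hi in zip(p15, p50, p85)]

eum_prev_close = eum[0]    # about 0.407
eum_day1_4pm = eum[4]      # about 0.233
eum_day2_open = eum[6]     # about 0.199
eum_day2_close = eum[8]    # about 0.133
```

The Day 1 Close value works out to roughly 0.2366, so the 23.6% quoted in the text appears to be truncated rather than rounded.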

The C-Squared statistic gives a measure of the net movement of forecasts
within a forecast period. It might be viewed as a measure of the directional efficiency
of information in the marketplace. By construction, so long as all forecast changes
continue in the same direction (e.g., continued downward revision or continued
upward revision), the C-Squared statistic equals 1. Information may not be
appearing instantaneously, as predicted by some financial theories, but at least what
information is arriving continues in the same direction as predecessor information.
To the extent that the forecasts see-saw throughout the day, there will be far more
movement than actual end-of-the-day net change. In such a case, the directional
efficiency of the forecasts would be quite low, and the C-Squared statistic would
approach 0 in value.

In the above example, Day 1 began with C-Squared equaling 1, by definition.
The next observation, 10:00 a.m., produces a C-Squared of 96.4% as the forecast
has dropped to its example low of 88.29. The statistic shows that about 3.6% of the
movement happened from close to open, but that most of the forecast movement
happened between open and the 10:00 a.m. measurement. As the forecasts start
increasing, the large drop to 88.29 is increasingly revealed as a detour, detracting
from directional efficiency. By noon, the C-Squared has dropped to 69% and
continues its plummet, reaching 1.2% by 4 p.m. However, as the median forecasts
drop further, the C-Squared recovers somewhat to 12%.

The C-Squared Statistic together with the EUM Statistic indicate that there
was significant, confused information being incorporated into the forecasts, but it was
accompanied by a tightening of the forecast distribution even as wildly changing
forecasts were being produced. In other words, the market was increasingly moving
together even while being whipsawed by whatever was causing the forecast jumps.

Pricing Derivative Instruments

One application of the internet-based consensus forecast is in the direct
estimation of the statistical distribution associated with the market expectations of
future outcomes. These distributions, directly measurable from the prediction
database, can be applied as the a priori and the a posteriori distributions (for
updating) in Bayesian estimators. An aspect of the present invention is the
application of these empirically derived distributions to Bayesian estimators in the
initialization, training, and operation of neural networks, of Bayesian neural networks,
of adaptive filters, and of mixed estimation econometric models.

These forecast distributions are also directly applicable to the estimation of
various volatility measures, for options estimation purposes (as described below),
and of broad classes of market sentiment measures, including submeasures
according to various groupings of the forecast participants. For example, one could
evaluate the market sentiment for those in the urban East Coast in contrast to those in
the rural Northern United States.

Utilizing an enormous longitudinal database according to the present invention
also can permit one to obtain fairly accurate measurements of certain quantities,
which previously had to be estimated in a more indirect manner. Consider the
problem of pricing a three-month call option on a stock currently selling at $50 if the
exercise price (EP) is $55 (i.e., an "out of the money" option). Existing pricing models
require an estimate of the variance of the stock price over the next three months.
Conventionally, historical data have been used to make this estimate. Thus, in a
changing market, such conventional techniques are often inadequate. Moreover,
these conventional models typically also assume that both the stock and the option
trade in efficient markets. Hence, the expected price of the stock is assumed to rise
over time only at some equilibrium rate of return. Assume that this rate is 8% per
year, such that the expected price of the stock in 3 months is $51. Suppose further that
information became available indicating that the value in 3 months should be $55.
Under the efficient market assumption, the stock would immediately jump from $50
to (about) $54 and the price of the (now less "out of the money") call would jump
correspondingly to re-establish the option pricing model relationship. Hence, the
traditional view is that an increase in the expected return on the stock will cause both
the stock price and the option price to rise, while an increase in the variance of the
stock return will only cause the option price to rise (and may cause the stock price
to fall - which would moderate the option price rise).

According to the present invention, however, there is available a large number
of estimates of the stock price at various time points throughout the three-month
period. The resulting distribution for any given time point, with the percentage of
the total number of forecasts on the vertical axis and the stock price on the horizontal
axis, is at least an estimate of the probability distribution function for the stock price
at that time point. Hence, it is generally not necessary to use historical data to
compute the future variance because the appropriate price for the option can be
computed directly. Specifically, an estimate of the current price for the option can
be determined by computing the area under the forecast distribution above the EP
and taking a present value.
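
As a minimal sketch of this computation, the following Python fragment treats
each collected forecast as an equally likely outcome, weights the portion of the
distribution above the EP by the call's payoff, and discounts the result. The
function name, the sample forecasts, and the use of annual compounding at the
8% rate from the example above are assumptions made only for illustration.

```python
def call_price_from_forecasts(forecasts, exercise_price, annual_rate, years):
    """Discounted mean payoff of a call, with forecasts as equally likely prices."""
    payoffs = [max(price - exercise_price, 0.0) for price in forecasts]
    expected_payoff = sum(payoffs) / len(payoffs)
    # Present value, assuming annual compounding (an illustrative choice).
    return expected_payoff / (1.0 + annual_rate) ** years


# Hypothetical three-month-ahead forecasts for a stock now selling at $50.
forecasts = [48.0, 50.0, 52.0, 55.0, 57.0, 60.0, 61.0, 45.0]
price = call_price_from_forecasts(
    forecasts, exercise_price=55.0, annual_rate=0.08, years=0.25
)
```

Only the three forecasts above $55 contribute to the estimate, so no separate
variance estimate from historical data is needed in this sketch.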

Depending partly upon the actual number of predictions available, it may be
more accurate to aggregate all predictions over the three-month period or to
subdivide the three-month period into shorter time intervals (whose length also
depends upon the number of predictions). In the latter case, the option price can be
estimated with respect to each shorter time interval, and the maximum price so
obtained (possibly after discarding certain outliers) can then be adopted as the
option's true price. While this technique might provide more accurate estimates
where a large number of predictions are available, if the number of predictions is
smaller it may be difficult to subdivide (or to subdivide beyond some minimum time
period) and still obtain statistically meaningful results. In any event, by comparing
the option pricing model's implied variance to the value computed according to the
present invention, or simply the current option price to the value computed according
to the present invention, we can identify potentially overpriced (or underpriced) options.

Nor is this all. One of the original reasons to get the stock price forecast was
to try to identify stocks expected to under or over-perform. In other words, the
procedure according to the present invention generally is not wedded to an
assumption of market efficiency. We are thus able to allow both the stock and the
option to be inefficiently priced and further determine (based upon the same - and
hence at least consistent - forecast distribution) which is more inefficiently priced.
This result will have clear implications for hedging (e.g., long one and short the
other).

The foregoing discussion can be easily extended to the valuation of other
derivative instruments (i.e., instruments whose value depends upon the value of an
underlying asset on a future date or dates). Specifically, by assuming that the
distribution of forecasts for the value of the underlying asset at a given point in the
future is the same as the probability density function for the asset's value at that
point in time, it becomes a straightforward matter to determine the probability that the
underlying asset will have any particular price at that point in time. It also generally
will be a simple matter to determine the value of the derivative instrument if the
underlying asset is assumed to have a given value at a given point in time. For
example, in the call option example given above, the value of the derivative
instrument is equal to the assumed value of the underlying stock minus the exercise
price or zero, whichever is greater, discounted to present value. Accordingly, the
derivative can be priced as follows:



$$D = \sum_{\text{all } UA} D(UA) \cdot P(UA)$$

where D is the value of the derivative instrument, UA is the assumed value of the 
underlying asset on the future date, D(UA) is the derivative's value given UA, and 
P(UA) is the probability of UA. It is noted that all possible values of UA can be used 
or else a coarser selection of discrete values of UA can be used, e.g., with each 
forecast being deemed to be the permissible value of UA to which it is closest. 
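
The coarser-selection variant just described can be sketched as follows. Each
forecast is snapped to the nearest permissible value of UA, P(UA) is taken as
the fraction of forecasts landing on that value, and D is the probability-weighted
sum of D(UA). The grid, the forecasts, and the put-style payoff are hypothetical,
and any discounting to present value is assumed to be folded into the payoff
function supplied by the caller.

```python
from collections import Counter


def snap_to_grid(value, grid):
    """Return the permissible UA value closest to a forecast (ties go to the first)."""
    return min(grid, key=lambda candidate: abs(candidate - value))


def derivative_value(forecasts, grid, payoff):
    """D = sum over UA of D(UA) * P(UA), with P(UA) estimated from forecast counts."""
    counts = Counter(snap_to_grid(forecast, grid) for forecast in forecasts)
    total = len(forecasts)
    return sum(payoff(ua) * count / total for ua, count in counts.items())


def put_payoff(ua, exercise_price=50.0):
    """Hypothetical D(UA): a put struck at 50, left undiscounted for brevity."""
    return max(exercise_price - ua, 0.0)


grid = [45.0, 50.0, 55.0, 60.0]
forecasts = [44.0, 49.0, 51.0, 56.0, 59.0, 61.0, 54.0, 52.0]
value = derivative_value(forecasts, grid, put_payoff)
```

Here one forecast in eight snaps to 45, the only grid value at which the
hypothetical put pays off, so the estimate is 5 x 1/8.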

Because the value of many derivative instruments will depend not only on the
value of the underlying asset at a single point in time, but rather over a range of
times, the foregoing calculation can be repeated for a number of different time points
in the applicable period. Then the value of the derivative instrument can be set to
be the maximum over all such time points or can be selected in any other manner.
For example, other techniques which take into account the likely risk in waiting to
exercise the applicable rights under the derivative instrument, as compared to the
likely reward in doing so, may be preferable (i.e., biasing toward earlier exercise).
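
A minimal sketch of the "maximum over all such time points" rule is given
below. It repeats the expected-payoff calculation for each time point and keeps
the largest discounted value. The data, the function names, and the annual
compounding convention are illustrative assumptions, and no adjustment for the
risk of waiting is attempted.

```python
def value_over_horizon(forecasts_by_time, payoff, annual_rate):
    """Return (years, value) for the time point with the largest discounted value."""
    best = None
    for years, forecasts in forecasts_by_time.items():
        expected = sum(payoff(forecast) for forecast in forecasts) / len(forecasts)
        present_value = expected / (1.0 + annual_rate) ** years
        if best is None or present_value > best[1]:
            best = (years, present_value)
    return best


def call_payoff(price, exercise_price=55.0):
    """Payoff of the call from the example above (exercise price of $55)."""
    return max(price - exercise_price, 0.0)


# Hypothetical forecasts at one, two, and three months (keys are in years).
forecasts_by_time = {
    1 / 12: [50.0, 52.0, 56.0],
    2 / 12: [49.0, 55.0, 58.0],
    3 / 12: [47.0, 57.0, 61.0],
}
best_time, best_value = value_over_horizon(forecasts_by_time, call_payoff, 0.08)
```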

Additional Analytical Techniques

A variety of additional sophisticated techniques based on the collected
forecast data-warehouse, such as products based on cointegration techniques, can
also be provided. Cointegration techniques are statistical methods used for the
analysis of highly correlated data series such as stock prices. Several examples of
such additional techniques are as follows.

First, based on the distributions of the consensus estimates for the interest
rate series, confidence bands can be estimated around the specified points on the
yield curve for each of the future time horizons. With statistical curve fitting methods,
a nonlinear yield curve can be estimated through the forecast points. With the
empirical forecast distributions, one can perform resampling to estimate the
confidence surfaces for any desired percentile. As a result, far better Value at Risk
and bond-pricing analysis can be performed. Similarly, far better Value at Risk
analysis for complicated derivatives and hedge products can be performed.
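
One possible form of the resampling step is sketched below: a bootstrap of the
median forecast for a single point on the yield curve, from which a percentile
band is read off. The choice of the median as the consensus statistic, the
nearest-rank percentile rule, the number of draws, and the sample forecasts are
all assumptions made for illustration.

```python
import random


def percentile(sorted_values, q):
    """Nearest-rank style percentile of a pre-sorted list, with q in [0, 100]."""
    index = round(q / 100.0 * (len(sorted_values) - 1))
    return sorted_values[index]


def bootstrap_median_band(forecasts, lower=5.0, upper=95.0, draws=2000, seed=0):
    """Resample forecasts with replacement and return a band for the median."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    medians = []
    for _ in range(draws):
        sample = sorted(rng.choices(forecasts, k=len(forecasts)))
        medians.append(percentile(sample, 50.0))
    medians.sort()
    return percentile(medians, lower), percentile(medians, upper)


# Hypothetical forecasts (in percent) for one maturity at one future horizon.
rate_forecasts = [5.8, 6.0, 6.1, 6.1, 6.2, 6.3, 6.3, 6.4, 6.6, 7.0]
low, high = bootstrap_median_band(rate_forecasts)
```

Repeating the calculation for each maturity and horizon would give one band
per point, through which a confidence surface could then be fitted.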

Based on the results of the periodic Special Challenge requesting the relative
ranking of various types of investments, the resulting ranks can be matched against
the participants' demographic variables in the database to provide investment
allocation suggestions. Based on the expected price distributions for long term
forecasts, a nonlinear optimization algorithm can be used (such as a genetic
algorithm) to determine optimal portfolios given specific constraints and objectives.
For example, applying a genetic algorithm model to these data will quickly identify
the least risk portfolio for a given amount of new money investment, the maximum
return portfolio, and the maximum return in given stock sectors. By integrating the
Premium Sites with the forecast predictions, bonds and cash can also be included
in the optimal portfolios. The application of the genetic algorithm to consider the
forecast risk as measured by the consensus panel provides a powerful solution.

Using randomly assigned clusters, stepwise regression can be applied to the
realization series and the historical predictions and errors for each of the forecasters
in these random clusters. The regression results will identify candidates for an
"individual-based" model. The identified candidates can then be included in a large
group which also can be analyzed using stepwise regression to identify an
appropriate set of regressors. In this manner, improved forecasts can often be
provided by using historical weighting of the predictions of individual participants.
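
For illustration, a forward stepwise selection over the forecasters in one
cluster might look like the following pure-Python sketch. It is only one of
several stepwise variants. The 20% improvement threshold, the function names,
and the small synthetic data set (in which the realization depends on the first
and third forecasters only) are assumptions, and the linear solver assumes that
the candidate columns are not collinear.

```python
def solve_linear_system(a, b):
    """Solve a.x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def rss(columns, y):
    """Residual sum of squares of an OLS fit of y on an intercept plus columns."""
    design = [[1.0] * len(y)] + columns
    gram = [[sum(p * q for p, q in zip(u, v)) for v in design] for u in design]
    moments = [sum(p * q for p, q in zip(u, y)) for u in design]
    beta = solve_linear_system(gram, moments)
    fitted = [sum(b * col[i] for b, col in zip(beta, design)) for i in range(len(y))]
    return sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))


def forward_stepwise(candidates, y, min_improvement=0.2):
    """Add the forecaster that most reduces the RSS until the gain falls below the threshold."""
    selected = []
    current = rss([], y)
    while len(selected) < len(candidates):
        trials = {j: rss([candidates[k] for k in selected + [j]], y)
                  for j in range(len(candidates)) if j not in selected}
        best = min(trials, key=trials.get)
        if current - trials[best] < min_improvement * current:
            break
        selected.append(best)
        current = trials[best]
    return selected


# Synthetic, mutually orthogonal "historical prediction" series for three forecasters.
x0 = [1, -1, 1, -1, 1, -1, 1, -1]
x1 = [1, 1, -1, -1, 1, 1, -1, -1]
x2 = [1, 1, 1, 1, -1, -1, -1, -1]
noise = [0.01 * a * b for a, b in zip(x0, x1)]
realized = [2 * a + 3 * c + e for a, c, e in zip(x0, x2, noise)]
selected = forward_stepwise([x0, x1, x2], realized)
```

In this synthetic case the third forecaster is picked first, then the first,
and the second forecaster is never selected because it adds no explanatory power.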

Traditional neural networks can be spectacular at finding patterns in the
realization of data, but they require significant internal stability in the system being
predicted because of the great length of time to train the network. Bayesian Neural
Networks (BNNs) allow for the use of a priori statistical distributions on possible
outcomes to train the network more efficiently. There are numerous innovative ways
that the empirical forecast distributions associated with the present consensus panel
can improve the performance of neural network systems. By using the forecast
distributions across multiple time horizons, the Premium Site consensus panel allows
for the simultaneous estimation of a priori and a posteriori distributions in advance
of the realization. The neural network can be trained using the repeated forecast
horizons as repeated iterations for training purposes, allowing the neural network to
be trained to respond to newly perceived market relations far more quickly than in
traditional models. Moreover, besides providing forecasts, the BNN approach can be
used to determine improved combination weights for real-time reweighting of the
consensus panel.

In another artificial intelligence approach to determining optimal combination
weights, a genetic algorithm may be run in real time to reweight a forecast combination
equation based on the recency of each individual's (or cluster's) predictions as well
as the historical accuracy of that individual (or cluster).

In the limited z-matrix weighting regime switching model, demographic
variables are used along with economic forecast variables (GNP, U, r, P) to
determine nonlinear regime switching parameters for individual forecast level
combination equations. Similarly, economic forecast variables can be used to
determine cluster weighting.

Utilization of Banner Ad Click-Through Information 

As noted above, it is common for web sites to display banner advertisements
("banner ads") that also function as hyperlinks. However, in the past very little has
been done to analyze the information regarding the number of banner ads to which
viewers respond ("click-throughs"). The following describes a mechanism utilizing
the click-through response information to provide additional valuable economic
information.

A web site according to the preferred embodiment of the present invention
internally categorizes banner ads by industry or economic group. For example, ads
for mortgages would be grouped together, as would ads for automobiles. This
grouping model preferably includes categories as well as sub-categories (to as many
levels as necessary). Any sub-category can have multiple parent categories, and the
link between sub-category and parent category preferably has a real-valued weight
between 0 and 1, indicating the level of representation of the sub-category within the
parent. The weights of all sub-categories under a specific parent category preferably
sum to 1. This model is a weighted directed acyclic graph. As examples, "Auto
Accessories" might be represented as a subset of "Auto", and "Chain Restaurant"
might be represented as a subset of "Food" and also as a subset of "Franchise
Businesses" (preferably, when the weights are unspecified, their default value is 1).

The web site preferably collects information on each click-through.
Specifically, the number of click-throughs for each category and the number of ads
for that category that were presented during a specific period (say, one week) are
counted. Additionally, the data may be further subdivided into various demographic
and expectational categories, such as geographic regions or a group of subscribers
with certain beliefs or forecast expectations. The collection of click-through rates
(click-throughs / ads presented, for each category) covering one period will then be
compared to one or more prior periods (e.g., rate_current / avg(rate_previous(i))) to
determine click-through indices which measure whether there has been a change in
consumer sentiment for each category. For example, a click-through index for mortgage
ads for individuals living in the Midwest that is greater than 1.0 would indicate an
increasing interest in mortgages within that region. Where a category has
sub-categories, the aggregate values of click-throughs and ads presented for the
category are calculated by summing the products of each sub-category's weight and
click-throughs or ads presented.
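
The aggregation and index calculations described above can be sketched as
follows. The category names other than "Auto" and "Auto Accessories", the link
weights, and all counts and rates are hypothetical. The sketch also assumes that
any click-throughs recorded directly against a parent category are added to the
weighted sub-category totals, and it visits a sub-category once per parent link.

```python
def aggregate(category, children, counts):
    """Return (click_throughs, ads_presented) for a category.

    children maps a parent to a list of (sub_category, weight) links, and
    counts maps a category to the totals recorded directly against it.
    """
    clicks, ads = counts.get(category, (0.0, 0.0))
    for child, weight in children.get(category, []):
        child_clicks, child_ads = aggregate(child, children, counts)
        clicks += weight * child_clicks
        ads += weight * child_ads
    return clicks, ads


def click_through_index(current_rate, previous_rates):
    """rate_current / avg(rate_previous(i)), as in the example above."""
    return current_rate / (sum(previous_rates) / len(previous_rates))


children = {"Auto": [("Auto Accessories", 0.5), ("Auto Dealers", 0.5)]}
counts = {"Auto Accessories": (30.0, 1000.0), "Auto Dealers": (50.0, 2000.0)}

clicks, ads = aggregate("Auto", children, counts)
current_rate = clicks / ads
index = click_through_index(current_rate, [0.020, 0.024, 0.022])
```

With these figures the index exceeds 1.0, which under the interpretation given
above would suggest increasing interest in the "Auto" category.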

Additionally, the category click-through index can be compared to the
click-through index for each individual ad within that category to provide independent
measures of changes in market sentiment for specific products. Some example
conclusions that can be drawn are:



(product click-through index) / (category click-through index) > 1.0 =>
    Effective number of ad impressions and/or gain in market share
(product click-through index) / (category click-through index) < 1.0 =>
    Ad saturation and/or loss of market share
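
Expressed as a short Python sketch, with a hypothetical function name and
hypothetical index values, the two example conclusions amount to a comparison of
the ratio against 1.0. The handling of a ratio exactly equal to 1.0 is an added
assumption.

```python
def interpret_relative_index(product_index, category_index):
    """Compare a product's click-through index with that of its category."""
    ratio = product_index / category_index
    if ratio > 1.0:
        return ratio, "effective number of ad impressions and/or gain in market share"
    if ratio < 1.0:
        return ratio, "ad saturation and/or loss of market share"
    return ratio, "product is tracking its category"  # assumed neutral case


gaining = interpret_relative_index(1.3, 1.1)
saturated = interpret_relative_index(0.9, 1.2)
```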

These indices (or other functions of the click-through rates) can also be utilized as
additional variables for the statistical forecasting described above. For example,
models can be estimated which use changes in the indices as leading indicators for
broader economic measures (e.g., mortgage click-throughs may be a leading
indicator for housing starts or GNP). The indices also can provide the foundation for
additional consumer sentiment measures, even to the extent of analyzing differential
industry performance.

For example, click-through statistics (such as the indices described above)
can be combined with the cluster statistics in order to provide enhanced combination
forecasts. In this implementation, the weights assigned to the click-through statistics
preferably would be determined in a similar manner as for the cluster statistics, i.e.,
based on the predictive accuracy of such rates in previous combination forecasts.
Alternatively, click-through statistics alone could be used to generate forecasts or the
click-through statistics could be combined with any other indicators to generate
forecasts.

Moreover, the click-through statistics can first be separated out into
click-through statistics for different demographic groups or for groups sharing other
common personal characteristics (such as by using the personal characteristic
information obtained in the contest registration described above). Upon doing so,
it is likely that the click-through statistics for certain groups will have greater
predictive accuracy than for other groups. Accordingly, by appropriately selecting
the groups to use, prediction accuracy can be further enhanced. The groupings can
be made using the clusters described above that are generated based on the
individuals' predictions, based on ad hoc notions, or based on any other criteria.

Preferably, however, new clusters are formed in the same manner discussed
above, but instead based on the correlation between the participants' click-through
rates and the variations in the subject variable. This technique should result in
optimal or near optimal clustering for the intended purpose. Also, assignment
formulas can be generated (in the same manner described above) for assigning new
participants to these clusters for purposes of categorizing their click-through
information.

Additional valuable information can be obtained by correlating: (1) click-through
rates (i.e., number of click-throughs divided by the number of ads presented)
or other click-through statistics with the demographic information or other personal
characteristic information for the viewer; (2) click-through statistics for a viewer with
the viewer's predictions; and/or (3) click-through statistics with the variable being
predicted on the page on which the banner ad appears. In particular, this information
can have important implications for targeting banner ads in the most effective
manner.

Finally, it is preferable to maintain saturation as well as penetration
information. In other words, in collecting the click-through data, it is preferable to
maintain and to utilize in the statistical analyses described above data that
distinguish between the same respondents clicking repeatedly on similar ads and
distinct respondents clicking on similar ads. The foregoing can be accomplished, for
example, by ignoring click-throughs above a certain maximum (e.g., 1, 2 or 3) for the
same individual, ignoring click-throughs above a certain maximum (e.g., 1, 2 or 3)
for the same individual within a predetermined period of time (e.g., 1 month), giving
less weight to additional click-throughs for the same individual, or giving less weight
to additional click-throughs for the same individual within a predetermined period of
time (e.g., 1 month). It is noted that the foregoing techniques are preferably utilized
in connection with a registration process that permits the website operator to
distinguish different individuals.
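
Two of the alternatives listed above, capping within a period and giving less
weight to repeat click-throughs, are sketched below. The event format (a
registered user identifier and a day number), the 30-day window, the cap of 1,
and the geometric weighting factor of 0.5 are illustrative assumptions.

```python
from collections import defaultdict


def capped_click_throughs(events, cap=1, window_days=30):
    """Count click-throughs, ignoring those beyond `cap` per individual per window."""
    counted_days = defaultdict(list)  # user_id -> days of clicks already counted
    counted = 0
    for user_id, day in sorted(events, key=lambda event: event[1]):
        recent = [d for d in counted_days[user_id] if day - d < window_days]
        if len(recent) < cap:
            recent.append(day)
            counted += 1
        counted_days[user_id] = recent
    return counted


def weighted_click_throughs(events, decay=0.5):
    """Give the k-th repeat click by the same individual a weight of decay**k."""
    clicks_so_far = defaultdict(int)
    total = 0.0
    for user_id, _day in events:
        total += decay ** clicks_so_far[user_id]
        clicks_so_far[user_id] += 1
    return total


# Hypothetical click-throughs on similar ads: (registered user, day number).
events = [("a", 1), ("a", 2), ("a", 40), ("b", 3), ("c", 5), ("c", 6)]
raw_clicks = len(events)
distinct_respondents = len({user_id for user_id, _day in events})
capped = capped_click_throughs(events)
weighted = weighted_click_throughs(events)
```

Keeping the raw count alongside the distinct-respondent, capped, and weighted
counts preserves both the saturation and the penetration information.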


Network Environment 

Figure 12 is a block diagram illustrating the network structure of the
environment in which the present invention operates, according to one exemplary
embodiment. Shown in Figure 12 are participant terminals 231 and 232, which may
comprise either an ordinary computer workstation, a laptop computer, or
special-purpose computing equipment. Terminals 231 and 232 communicate with Internet
service providers (ISPs) 241 and 242 via a telephone connection, such as by using
a modem interface. ISPs 241 and 242, in turn, connect to Internet backbone 250 via
their respective routers (not shown). Specifically, ISP 241 receives Internet
messages from terminal 231 and then routes them onto Internet backbone 250.
Also, ISP 241 pulls messages off Internet backbone 250 that are addressed to
terminal 231 and communicates those messages to terminal 231 via the telephone
connection. In a similar manner, terminal 232 also can communicate over the
Internet through ISP 242.

Also connected to Internet backbone 250 is Internet server 260. As discussed
in more detail below, one function performed by Internet server 260 is to interact with
participant terminals, such as terminals 231 and 232, over the Internet in order to
supply the participants with various informational resources and to accept prediction
information from the participants. Internet server 260 then provides the prediction
information, via local area network (LAN) 270, to various processing stations, such
as stations 271 to 273. While Internet server 260 may be capable of performing
some of the simple processing tasks, such as finding the median of the prediction
data for each prediction event, the more complicated processing preferably is
performed by one or more dedicated processing stations, such as stations 271 to
273.

Although terminals 231 and 232 are shown in Figure 12 as being attached to
Internet server 260 via the Internet 250, other methods can also be used for
communicating between remote terminals and the Internet server 260, such as by
utilizing a direct modem/telephone line dial-in connection, a wide area network, a
local area network or any other communication system. Furthermore, different
terminals may be connected to server 260 via different communication systems. For
example, individual computer workstations might connect to Internet server 260 via
the Internet 250, while terminals under common ownership with Internet server 260
might communicate with Internet server 260 via a wide area network or a direct
dial-in connection. Similarly, although Internet server 260 is shown in Figure 12 as being
connected to the various processing stations using LAN 270, any other
communication system may also (or instead) be used, such as a wide area network,
local area network, Internet, or direct modem/telephone line dial-in connection.

System Environment

Generally, the network nodes referenced above can be implemented either
as a general purpose or a special purpose computer, either with a single processor
or with multiple processors. Figure 13 is a block diagram of a general purpose
computer system, representing one of many suitable computer platforms for
implementing the methods described above. Thus, the general purpose computer
system illustrated in Figure 13 might be used to implement any of processing stations
271 to 273, Internet server 260 or participant terminals 231 and 232. However, the
system shown in Figure 13 is more preferably used only for Internet server 260 and
various participant terminals, such as 231 and 232. Because of the intensive
processing requirements, the processing stations (such as 271 to 273) preferably are
implemented as multi-processor boxes having a large amount of random access
memory (RAM), such as 8 gigabytes.

Specifically, Figure 13 shows a general purpose computer system 350 in
accordance with the present invention. As shown in Figure 13, computer system 350
includes a central processing unit (CPU) 352, read-only memory (ROM) 354, RAM
356, expansion RAM 358, input/output (I/O) circuitry 360, display assembly 362,
input device 364, serial port 382, modem port 384, and expansion bus
366. Computer system 350 may also optionally include a mass storage unit 368
such as a disk drive unit or nonvolatile memory such as flash memory and a
real-time clock 370.

CPU 352 is coupled to ROM 354 by a data bus 372, control bus 374, and
address bus 376. ROM 354 contains the basic operating system for the computer
system 350. CPU 352 is also connected to RAM 356 by busses 372, 374, and 376.
Expansion RAM 358 is optionally coupled to RAM 356 for use by CPU 352. CPU
352 is also coupled to the I/O circuitry 360 by data bus 372, control bus 374, and
address bus 376 to permit data transfers with peripheral devices.

I/O circuitry 360 typically includes a number of latches, registers and direct
memory access (DMA) controllers. The purpose of I/O circuitry 360 is to provide an
interface between CPU 352 and such peripheral devices as display assembly 362,
input device 364, serial port 382, modem port 384, and mass storage 368.

Display assembly 362 of computer system 350 is an output device coupled
to I/O circuitry 360 by a data bus 378. Display assembly 362 receives data from I/O
circuitry 360 via bus 378 and displays that data on a suitable screen.

The screen for display assembly 362 can be a device that uses a cathode-ray
tube (CRT), liquid crystal display (LCD), digital flat panel, or the like, of the types
commercially available from a variety of manufacturers. Input device 364 represents
one or more of a keyboard, a mouse, a magnetic card reader, a bar code reader, a
stylus working in cooperation with a position-sensing display, or the like. The
aforementioned input devices are available from a variety of vendors and are well
known in the art.

Some type of mass storage 368 is generally considered desirable. However,
mass storage 368 can be eliminated by providing a sufficient amount of RAM 356 and
expansion RAM 358 to store user application programs and data. In that case,
RAMs 356 and 358 can optionally be provided with a backup battery to prevent the
loss of data even when computer system 350 is turned off. However, it is generally
desirable to have some type of long term mass storage 368 such as a commercially
available hard disk drive, nonvolatile memory such as flash memory, battery backed
RAM, PC-data cards, or the like.

A removable storage read/write device 369 may be coupled to I/O circuitry
360 to read from and to write to a removable storage media 371. Removable
storage media 371 may represent, for example, a magnetic disk, a magnetic tape,
an opto-magnetic disk, an optical disk, or the like. Instructions for implementing the
inventive method may be provided, in one embodiment, to a network via such a
removable storage media.

In operation, information is input into the computer system 350 by, for
example, swiping a magnetically encoded or bar-coded card through an appropriate
card reader, typing on a keyboard, manipulating a mouse or trackball, or "writing" on
a tablet or on a position-sensing screen of display assembly 362. CPU 352 then
processes the data under control of an operating system and an application program,
such as a program to perform steps of the inventive method described above, stored
in ROM 354 and/or RAM 356, typically after downloading the program from mass
storage 368. CPU 352 then typically produces data which is output to the display
assembly 362 to produce appropriate images on its screen.

Expansion bus 366 is coupled to data bus 372, control bus 374, and address
bus 376. Expansion bus 366 provides extra ports to couple devices such as network
interface circuits, modems, display switches, microphones, speakers, etc. to CPU
352. Network communication is accomplished through the network interface circuit
and an appropriate network. For example, the network interface circuit can connect
through a hub (not shown) into an external router (not shown) for communication
over a local area network, a wide area network or the Internet. Serial port 382 is
coupled to input/output circuitry 360 and can provide external communication for
computer system 350.

Modem port 384 is coupled to input/output circuitry 360 and also can provide
external communication for computer system 350. For example, by utilizing an
internal modem (not shown) in input/output circuitry 360 and connecting modem port
384 to an external telephone line (not shown), computer system 350 can connect to
various modem-based computer dial-up systems, including systems provided by
Internet service providers, which subsequently can connect computer system 350
to the Internet.

Suitable computers for use in implementing the present invention may be
obtained from various vendors. Various computers, however, may be used
depending upon the size and complexity of the tasks. Suitable computers include
mainframe computers, multiprocessor computers, workstations or personal
computers. In addition, although a general purpose computer system has been
described above, a special-purpose computer may also be used.

It should be understood that the present invention also relates to machine
readable media on which are stored program instructions for performing methods of
this invention. Such media include, by way of example, magnetic disks, magnetic
tape, optically readable media such as CD ROMs, semiconductor memory such as
PCMCIA cards, etc. In each case, the medium may take the form of a portable item
such as a small disk, diskette, cassette, etc., or it may take the form of a relatively
larger or immobile item such as a hard disk drive or RAM provided in a computer.



The business model of the present invention is certainly not limited to the
economic and financial data of the developed world. Suppose one wished to
estimate the GNP of Nigeria (or Cuba), where few records are kept and few of those
are reliable. The consensus approach would certainly be cheaper, and probably
more reliable, than the alternatives.

In addition to estimation of commodity spot and futures prices, the above
techniques can also be used in connection with crop forecasting. Going farther
afield, consumer and/or societal trends, such as the popularity of different
colors (for cars, appliances, etc.) or of individual movies, also can be forecast in a
manner which could be improved by the inventive methods described above.

Finally, the act of repeated surveys of a population of known identity and
demographics has numerous interesting marketing applications, the least of which
is targeted banner ads. Testing the evolution of new product reaction (through ads
and/or surveys with cBuck incentives) would seem to offer great potential, particularly
if the response information were analyzed in connection with the collected personal
characteristic information.

Generally speaking, the present invention provides an overall solution for 
gathering longitudinal prediction data and then processing that data to provide 
statistical estimates of various quantities. As described in more detail above, the 
data gathering aspect of the invention is implemented as a prediction contest, and 
can provide incentives for a large number of people and entities to participate on a 
frequent basis. For example, in a preferred embodiment of the invention, 
participants are ranked and/or rewarded based on track record over a period of time. 
In this way, participants have significant incentives to provide accurate predictions, 
as contrasted with many conventional contests which may encourage 
gamesmanship by rewarding a participant based on prediction accuracy with respect 
to discrete events, irrespective of how poorly the participant may have done in 
previous events. A number of different inventive features are included within this 
solution. 
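
By way of illustration only, the contrast between ranking on track record and rewarding a single discrete event can be sketched as follows. This is a minimal sketch under stated assumptions, not the scoring method of the invention: the use of mean absolute error as the accuracy measure, the function names and the sample error values are all hypothetical.

```python
from statistics import mean


def rank_by_track_record(errors_by_participant):
    """Rank participants by mean absolute prediction error over all events.

    Lower average error ranks first. Mean absolute error is an assumed
    accuracy measure chosen only for this illustration.
    """
    scores = {
        name: mean(abs(e) for e in errs)
        for name, errs in errors_by_participant.items()
    }
    return sorted(scores, key=scores.get)


def winner_of_single_event(errors_by_participant, event_index):
    """Return the participant with the smallest error on one event only."""
    return min(
        errors_by_participant,
        key=lambda name: abs(errors_by_participant[name][event_index]),
    )


# Hypothetical signed prediction errors (prediction minus actual) per event.
errors = {
    "steady": [1.0, -1.5, 1.0, 2.0],
    "lucky": [9.0, -12.0, 8.0, 0.5],
}

track_record_ranking = rank_by_track_record(errors)
last_event_winner = winner_of_single_event(errors, 3)
```

With these sample values, the participant who is consistently close ranks first on track record, while the participant who was far off in every earlier event would nonetheless win a contest judged on the last event alone.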

Thus, although the present invention has been described in detail with 
regard to the exemplary embodiments and drawings thereof, it should be apparent 
to those skilled in the art that various adaptations and modifications of the present 
invention may be accomplished without departing from the spirit and the scope of the 
invention. Accordingly, the invention is not limited to the precise embodiments 
shown in the drawings and described in detail hereinabove. Rather, it is intended 
that all such variations not departing from the spirit of the invention be considered as 
within the scope thereof as limited solely by the claims appended hereto. 

Also, several different embodiments of the present invention are 
described above, with each such embodiment described as including certain 
features. However, it is intended that the features described in connection with the 
discussion of a single embodiment are not limited to that embodiment but may be 
included and/or arranged in various combinations in any of the other embodiments 
as well, as will be understood by those skilled in the art. 

In the following claims, those elements which do not include the words 
"means for" are intended not to be interpreted under 35 U.S.C. § 112 ¶ 6. 